1.0 Introduction


Stochastic  Resonance  (SR)  is  a  nonlinear  phenomenon  first  reported  and  analyzed  in  [Benzi 
1981]  in  terms  of  a  nonlinear  dynamic  effect.  Since  then,  it  was  proposed  [Benzi  1982]  [Nicolis] 
to  explain  the  observed  periodic  occurrences  of  the  earth’s  ice  ages  using  a  two  state  system 
denoting  the  earth’s  climate  at  present  and  during  the  ice  age.  Variations  in  the  absorption  and 
reflectance  of  incident  solar  energy  due  to  changing  weather  conditions  constituted  the  system 
‘noise’.  The  weak  periodic  ‘signal’  consisted  of  variations  of  incident  solar  energy  due  to 
periodic eccentricity in the earth’s orbit. Considerable research effort has since assessed
the effect in a wide range of applications including audio systems [Lipshitz], neural networks
[Lindner], hyperspectral imaging [Chiang], neuroscience [Kosko], medical imaging [Muller],
visual perception [Simonotto], and, more recently, tactical surveillance [Repperger], as well as
the applications cited in the reference section.

The  classic  SR  signature  is  the  signal-to-noise  ratio  (SNR)  gain  of  certain  nonlinear 
systems;  i.e.,  the  output  SNR  is  significantly  higher  than  the  input  SNR  when  an  appropriate 
amount  of  noise  is  added.  This  ratio  reflects  the  gain  achieved  by  the  processing  procedure. 
These  considerations  are  treated  in  references  [3]  -  [17]  of  [Chen,  et.  al.,  2006b].  Some 
approaches  have  been  proposed  to  tune  the  SR  system  by  maximizing  SNR.  It  has  been  shown 
that  the  SNR  of  a  summing  network  of  excitable  units  is  optimum  at  a  certain  level  of  noise 
[Collins].  Later,  for  some  SR  systems,  robustness  enhancement  using  non-Gaussian  noise  was 
reported  in  [Castro,  et.  al.].  For  a  fixed  type  of  noise,  Mitaim  and  Kosko  [Mitaim,  1998] 
proposed  an  adaptive  stochastic  learning  scheme  performing  a  stochastic  gradient  ascent  on  the 
SNR  to  determine  the  optimal  noise  level  based  on  the  samples  from  the  process.  Rather  than 
adjusting  the  input  noise  level,  [Xu,  et.  al.]  proposed  a  numerical  method  for  realizing  SR  by 
tuning  system  parameters  to  maximize  SNR  gain.  Although  SNR  is  a  very  important  measure  of 
system  performance,  SR  approaches  based  on  SNR  gain  have  several  limitations.  Specifically, 
SNR  characterizes  only  the  second  order  terms  of  the  processes;  i.e.,  the  signal  and  noise 
variance.  First,  the  definition  of  SNR  is  not  uniform  and,  in  fact,  varies  from  one  application  to 
another.  Second,  to  optimize  the  performance,  the  complete  a  priori  knowledge  of  the  signal  is 
required.  Finally,  for  detection  problems  where  the  noise  is  non-Gaussian,  higher  order  terms 
may  play  a  role  and  optimizing  output  SNR  does  not  guarantee  optimizing  probability  of 
detection. 

SR  was  also  found  to  enhance  the  mutual  information  (MI)  between  input  and  output 
signals [Godivier], [Goychuk], [Stocks], [Kosko 2003, 2004], [Mitaim 2004]. Similar to the
SNR  scenario,  for  a  specified  type  of  SR  noise,  [Mitaim  2004]  showed  that  almost  all  noise 
probability  density  functions  produce  some  SR  effect  in  threshold  neurons  and  a  new  statistically 
robust  learning  law  was  proposed  to  find  the  optimal  noise  level.  [McDonnell]  pointed  out  that 
the capacity of an SR channel cannot exceed the actual capacity at the input. Compared to SNR,
MI  is  more  directly  correlated  with  the  transferred  input  signal  information. 

In  signal  detection  theory,  SR  also  plays  a  very  important  role  in  improving  the  signal 
detectability.  In  [Asdi]  and  [Zozor  2002],  improvement  of  detection  performance  of  a  weak 
sinusoid  signal  is  reported.  To  detect  a  DC  signal  in  a  Gaussian  mixture  noise  background,  [Kay 
2000] showed that under certain conditions, performance of the sign detector can be enhanced by
adding some white Gaussian noise. For another suboptimal detector, the locally optimum
detector  (LOD),  [Zozor  2003]  pointed  out  that  detection  performance  is  optimum  when  the  noise 
parameters  and  detection  parameters  are  matched.  A  study  of  the  stochastic  resonance 
phenomenon  in  quantizers  conducted  in  [Saha]  showed  that  improved  detection  performance  can 
be  achieved  by  a  proper  choice  of  the  quantizer  thresholds.  Recently,  [Rousseau]  pointed  out 
that  the  detection  performance  can  be  further  improved  by  using  an  optimal  detector  on  the 
output  signal.  Despite  the  progress  achieved  by  the  above  approaches,  the  study  of  the  SR  effect 
in  signal  detection  systems  is  rather  limited  and  does  not  fully  consider  the  underlying  theory.  In 
[Chen  2006b],  the  underlying  mechanism  of  the  stochastic  resonance  phenomenon  was  explored 
for  a  more  general  two  hypotheses  detection  problem. 

The  type  of  detector  that  lends  itself  to  improved  detection  via  stochastic  resonance  is  one 
that  nonlinearly  processes  the  data.  The  eye  itself  is  a  nonlinear  device  and  so  it  is  conceivable 
and  has  been  demonstrated  empirically  that  improved  visual  detection  is  possible  through  this 
mechanism. The important question of what type of noise should be added has until recently evaded
a solution. In Phase I, this issue was addressed directly and a fundamental theoretical concept
was developed, leading to a determination of the optimal additive SR noise to achieve maximum
probability of detection $P_D$ subject to the constraint that the probability of false alarm $P_{FA}$ is not
increased.

Clearly,  improving  visual  imagery  for  human  visual  perception  will  depend  on  the  type  of 
nonlinearity  that  the  eye  employs.  Ultimately,  we  know  that  it  is  the  brain  that  responds  to  a 
visual  stimulus  causing  neurons  to  fire.  Conceivably  if  we  understood  the  effect  of  the  noise 
PDF  on  detectability  for  the  “eye  detector”,  then  we  could  add  noise  such  that  the  total  noise 
PDF  is  the  most  desirable  one.  This  requires  a  study  of  a.)  the  types  of  noise  PDFs  that  can  be 
obtained  via  convolution  since  adding  noise  random  variables  causes  a  convolution  of  their 
PDFs,  b.)  the  most  desirable  noise  PDF  from  the  standpoint  of  detectability,  and  c.)  how  standard 
detection  theory  relates  to  human  visual  perception. 

We  previously  pointed  out  that  there  are  limitations  of  the  signal-to-noise  ratio  as  the  most 
important  measure  of  human  detectability.  In  fact,  the  SNR  measure  only  characterizes  detection 
in  the  Gaussian  noise  case.  For  non-Gaussian  noise,  SNR  is  only  part  of  the  story  with,  for 
example,  the  intrinsic  accuracy  providing  the  remainder  in  the  independent,  identically 
distributed  (IID)  non-Gaussian  detection  problem.  The  intrinsic  accuracy  is  the  single  sample 
Fisher  information  for  a  DC  level  in  non-Gaussian  IID  noise  [Kay  1998].  Clearly,  these  two 
important  aspects  of  design  are  related  and  so  an  overall  strategy  of  noise  PDF  design  is 
required. It was also noted in the Phase I proposal that there are questions of whether the
additive enhancement noise should be IID from pixel to pixel. It is possible that correlated noise
might be more fruitful. This is an avenue of research that has not been addressed. Furthermore,
multivariate  PDF  design  is  a  difficult  problem,  but  there  has  been  recent  progress  in  this  area 
[Kay  2001],  [Tanner  1993],  [Ruanaidh  1996],  [Gilks  1996].  The  above  considerations  were 
noted,  but  not  fully  considered  in  the  Phase  I  effort.  In  Phase  II,  we  shall  return  to  give  more 
thorough  consideration  to  these  issues  among  others. 

The  establishment  of  an  analytic  framework  for  algorithmic  development  utilizing  stochastic 
resonance was a prime objective of the Phase I effort. In order to achieve this goal, it was
imperative to assess the relationship of the non-Gaussian nature of the noise processes, the
nonlinear aspects of the signal processing or suboptimal detection device, as well as the
characteristics of the sub-threshold signals.

An  important  consideration  here  is  that  the  non-Gaussian  noise  PDF  may  actually  provide  a 
potential  for  improved  detection  performance  provided  that  an  appropriate  detection  strategy  is 
employed.  Evidence  for  this  consideration  has  been  noted  by  [Kay],  [Michels]  in  several 
publications,  although  not  related  to  stochastic  resonance.  Several  of  these  analyses  involved 
partially  correlated  non-Gaussian  noise  processes  in  addition  to  additive  white  Gaussian  noise. 
As  noted  previously,  it  is  possible  that  the  use  of  correlated  noise  may  be  beneficial  in  the 
enhancement of the stochastic resonance effect. Both the application of correlated noise and
the assessment of processes already containing such noise are open areas of research which
have received little attention. They remain important research topics for the Phase II effort.

In  Chapter  2,  a  novel  fundamental  theory  addressing  the  optimization  of  stochastic 
resonance  in  detection  theory  is  outlined.  The  development  of  this  theory  was  the  prime 
contribution  of  Phase  I.  Specifically,  this  effort  achieved  the  goal  of  establishing  a  fundamental 
analytical  framework  for  the  application  of  stochastic  resonance  to  detection  from  which  an 
optimization  solution  could  be  obtained.  Chapter  3  provides  an  alternative  presentation 
introducing  the  analytical  framework.  In  Chapter  4,  a  novel  consideration  is  given  to  the 
potential  use  of  an  alternative  decision  statistic  transformation  methodology  to  recover  optimal 
performance for a suboptimal detector. Subsequent chapters address the application of the
analytical  detection  theory  framework  to  suboptimal  detectors  such  as  nonparametric  detectors 
(Chapter  5),  image  enhancement  (Chapter  6),  distributed  fusion  applications  (Chapter  7),  and 
probability  of  error  reduction  (Chapter  8)  with  implications  for  communications  theory.  Finally, 
recommendations and future considerations are presented in Chapter 9.

2.0 A Fundamental Detection Framework using Stochastic Resonance
2.1  Introduction  to  the  Fundamental  Detection  Framework  using  SR 


This  chapter  summarizes  the  mathematical  framework  developed  during  Phase  I  to  analyze 
the  stochastic  resonance  (SR)  effect  in  binary  hypothesis  testing  problems  [Chen,  et.  al.,  2006a, 
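b]. Specifically, given an N-dimensional data vector $\mathbf{x} \in \mathbb{R}^N$, we decide between two hypotheses
$H_1$ and $H_0$,

$$\begin{aligned}
H_0 &:\ p_x(\mathbf{x}; H_0) = p_0(\mathbf{x}) \\
H_1 &:\ p_x(\mathbf{x}; H_1) = p_1(\mathbf{x})
\end{aligned} \tag{2.1}$$

where $p_0(\mathbf{x})$ and $p_1(\mathbf{x})$ are the PDFs of $\mathbf{x}$ under $H_0$ and $H_1$, respectively. The test above can be
completely characterized by a critical function (decision function) $\phi$ where $0 \le \phi(\mathbf{x}) \le 1$ for all $\mathbf{x}$.
For any observation $\mathbf{x}$, this test chooses the hypothesis $H_1$ with probability $\phi(\mathbf{x})$. In many cases,
$\phi(\mathbf{x})$ can be implicitly expressed by a test statistic $T(\mathbf{x})$, which is a function of $\mathbf{x}$, and a threshold $\eta$
such that

$$T(\mathbf{x}) \underset{H_0}{\overset{H_1}{\gtrless}} \eta \tag{2.2}$$

where its corresponding critical function is

$$\phi(\mathbf{x}) = \begin{cases}
1, & T(\mathbf{x}) > \eta \\
\alpha, & T(\mathbf{x}) = \eta \\
0, & T(\mathbf{x}) < \eta
\end{cases} \tag{2.3}$$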


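and $0 \le \alpha \le 1$ is a suitable number. The probability of detection $P_D$ is now given by

$$P_D^{x} = \int_{\mathbb{R}^N} \phi(\mathbf{x})\, p_1(\mathbf{x})\, d\mathbf{x} \tag{2.4}$$

and the probability of false alarm $P_{FA}$ is given by

$$P_{FA}^{x} = \int_{\mathbb{R}^N} \phi(\mathbf{x})\, p_0(\mathbf{x})\, d\mathbf{x} \tag{2.5}$$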

where the superscripts on $P_D$ and $P_{FA}$ in (2.4) and (2.5) indicate that the test in (2.2) is
employed for the data vector $\mathbf{x}$. Although the critical function $\phi(\mathbf{x})$ and test statistic $T(\mathbf{x})$ can take
any form, the optimum Neyman-Pearson test involves the likelihood ratio test (LRT) where
$T_{LRT}(\mathbf{x}) = p_1(\mathbf{x})/p_0(\mathbf{x})$. Although this test provides optimal $P_D$ subject to the constraint that $P_{FA} = \alpha$
is fixed, the associated LRT requires complete knowledge of the PDFs $p_0(\mathbf{x})$ and $p_1(\mathbf{x})$. In
most practical applications, this knowledge is unavailable and may have to be estimated from the
data. Also, the input data statistics may vary with time or may change from one application to
another. Further, for many detection problems the exact form of the LRT may be too
complicated to implement. Therefore, suboptimal detectors featuring simplicity and robustness
are often preferred [Thomas 1970]. To improve the detection performance of a suboptimal detector,
two approaches are widely used. In the first approach, the detector parameters are varied [Zozor
1999, 2002, 2003], [Saha], [Galdi]. Alternatively, when the detector itself cannot be altered or
the  optimum  parameters  are  difficult  to  obtain,  adjusting  the  observed  data  becomes  a  viable 
approach.  Thus,  in  the  work  reported  here,  prime  consideration  was  given  to  applications  in 
which  such  detector  parameters  could  not  be  changed;  i.e.,  the  detector  parameters  were 
considered  to  be  fixed  in  terms  of  both  the  test  statistic  and  the  threshold.  This  is  often  the  case 
for  applications  where  the  signal  processing  methodology  is  not  under  the  user’s  control.  In  such 
cases,  we  consider  the  alternative  approach  of  utilizing  stochastic  resonance. 
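As a minimal sketch of (2.2) through (2.5), the following Python fragment evaluates the critical function of a fixed threshold test and integrates it numerically against the two densities. The scalar model (unit-variance Gaussian densities with means 0 and 1), the statistic T(x) = x, the threshold of zero, and all function names are assumptions made only for this illustration; they are not taken from the report.

```python
import math

def gauss_pdf(x, mean, sigma=1.0):
    """Scalar Gaussian density; stands in for p0 or p1 (assumed model)."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def critical_function(x, eta=0.0, tie_value=0.0):
    """phi(x) of (2.3) with T(x) = x: returns 1, tie_value, or 0."""
    if x > eta:
        return 1.0
    if x == eta:
        return tie_value
    return 0.0

def integrate_phi(pdf, lo=-12.0, hi=12.0, steps=24000):
    """Midpoint-rule approximation of the integral of phi(x) * pdf(x)."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * h
        total += critical_function(x) * pdf(x)
    return total * h

p_fa = integrate_phi(lambda x: gauss_pdf(x, 0.0))  # (2.5): integrate against p0
p_d = integrate_phi(lambda x: gauss_pdf(x, 1.0))   # (2.4): integrate against p1
print(f"P_FA = {p_fa:.4f}, P_D = {p_d:.4f}")
```

Because the detector is held fixed, only the densities passed to `integrate_phi` change between the two calls, which mirrors the way (2.4) and (2.5) share the same critical function.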

It is well known that the detection performance can be improved by adding additional noise
that is statistically dependent on the existing noise and/or has a PDF that depends on which
hypothesis is true [Kay 2000]. However, adding a dependent noise is not always possible
because  pertinent  prior  information  is  usually  not  available.  Therefore,  we  constrain  the  additive 
noise  to  be  independent  noise.  For  some  suboptimal  detectors,  as  noted  in  [Kay  2000],  detection 
performance  can  be  improved  by  adding  such  noise  to  the  data  under  certain  conditions.  For  a 
given  type  of  SR  noise,  the  optimal  amount  of  noise  can  be  determined  that  maximizes  the 
detection  performance  for  a  given  suboptimal  detector  [Inchiosa].  In  an  effort  to  explain  this 
noise  enhanced  phenomenon  for  some  integrate-and-fire  neuron  models,  [Tougaard] 
demonstrated  that  the  detection  performance  gain  is  caused  by  the  nonlinear  properties  of  the 
spike-generation  process  itself.  However,  despite  the  progress  made  in  the  literature,  the 
underlying  mechanism  of  this  Stochastic  Resonance  phenomenon  in  detection  problems  has  not 
been  fully  explored.  For  example,  an  interesting  problem  is  the  determination  of  the  best  ‘noise’ 
to  be  added  in  order  to  achieve  the  best  achievable  detection  performance  for  the  suboptimal 
detector. In this case, the detection problem can be stated as follows: given that the test, i.e., the
critical function $\phi(\cdot)$ (for example, $T$ and $\eta$), is fixed, can we improve the detection performance
by adding SR noise? If so, what noise PDF maximizes $P_D$ without increasing $P_{FA}$?

Here, a theoretical analysis is presented to gain further insight into the SR phenomenon, and
the detection performance of the noise modified observations is obtained. Further, the optimum
noise PDF, i.e., not only the noise level but also the noise type, is determined. As an illustrative
example, the optimum noise PDF as well as several suboptimum noise PDFs are derived for the
sign detector. We emphasize that, in contrast to some prior work where one or several nonlinear
systems are inserted between the final detector and the original input signal, here we consider
the decision function $\phi$ in general and may treat the entire system between the input signal and
the output detection result as a ‘super’ detector. The results obtained here can therefore be
applied to any detection system with any type of fixed SR preprocessing system. Also,
compared to the earlier definitions of SR [Benzi 1981], [Gammaitoni], we further extend the
concept of ‘SR’ to a pure noise enhanced phenomenon, i.e., a phenomenon of some nonlinear
systems in which the system performance is enhanced due to the addition of independent noise at
the input. In the work reported here, the terms ‘SR’ and ‘noise enhanced’ are used
interchangeably. However, we point out that the latter is actually a generalization of the former.

2.2 Problem Formulation


In  order  to  achieve  a  possible  enhancement  of  detection  performance,  we  add  noise  to  the 
original  data  process  x  and  obtain  a  new  data  process  y  given  by 

y  =  x  +  n,  (2.6) 

where $\mathbf{n}$ is either an independent random process with PDF $p_n(\mathbf{n})$ or a nonrandom signal. Note
that here we do not have any constraint on $\mathbf{n}$. For example, $\mathbf{n}$ can be white noise, colored noise,
or even a deterministic signal $\mathbf{A}$, corresponding to $p_n(\mathbf{n}) = \delta(\mathbf{n} - \mathbf{A})$. As will be shown later,
depending upon the detection problem, an improvement of detection performance may not
always be possible. In that case, the optimal noise is equal to zero. The PDF of $\mathbf{y}$ is expressed by
the convolution of the PDFs such that

$$p_y(\mathbf{y}) = p_x(\mathbf{x}) * p_n(\mathbf{x}) = \int_{\mathbb{R}^N} p_x(\mathbf{x})\, p_n(\mathbf{y} - \mathbf{x})\, d\mathbf{x}. \tag{2.7}$$


The  binary  hypothesis  testing  problem  for  this  new  observed  data  y  can  be  expressed  as 


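$$\begin{aligned}
H_0 &:\ p_y(\mathbf{y}; H_0) = \int_{\mathbb{R}^N} p_0(\mathbf{x})\, p_n(\mathbf{y} - \mathbf{x})\, d\mathbf{x} \\
H_1 &:\ p_y(\mathbf{y}; H_1) = \int_{\mathbb{R}^N} p_1(\mathbf{x})\, p_n(\mathbf{y} - \mathbf{x})\, d\mathbf{x}
\end{aligned} \tag{2.8}$$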


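Since the detector is fixed, i.e., the critical function $\phi$ of $\mathbf{y}$ is the same as that for $\mathbf{x}$, the
probability of detection based on the data $\mathbf{y}$ is given by

$$\begin{aligned}
P_D^{y} &= \int_{\mathbb{R}^N} \phi(\mathbf{y})\, p_y(\mathbf{y}; H_1)\, d\mathbf{y} \\
&= \int_{\mathbb{R}^N} \phi(\mathbf{y}) \int_{\mathbb{R}^N} p_1(\mathbf{x})\, p_n(\mathbf{y} - \mathbf{x})\, d\mathbf{x}\, d\mathbf{y} \\
&= \int_{\mathbb{R}^N} p_1(\mathbf{x}) \left( \int_{\mathbb{R}^N} \phi(\mathbf{y})\, p_n(\mathbf{y} - \mathbf{x})\, d\mathbf{y} \right) d\mathbf{x} \\
&= \int_{\mathbb{R}^N} p_1(\mathbf{x})\, C_{n,\phi}(\mathbf{x})\, d\mathbf{x} = E_1[C_{n,\phi}(\mathbf{x})]
\end{aligned} \tag{2.9}$$

where

$$C_{n,\phi}(\mathbf{x}) \equiv \int_{\mathbb{R}^N} \phi(\mathbf{y})\, p_n(\mathbf{y} - \mathbf{x})\, d\mathbf{y}. \tag{2.10}$$

Alternatively,

$$\begin{aligned}
P_D^{y} &= \int_{\mathbb{R}^N} p_n(\mathbf{x}) \left( \int_{\mathbb{R}^N} \phi(\mathbf{y})\, p_1(\mathbf{y} - \mathbf{x})\, d\mathbf{y} \right) d\mathbf{x} \\
&= \int_{\mathbb{R}^N} F_{1,\phi}(\mathbf{x})\, p_n(\mathbf{x})\, d\mathbf{x} = E_n(F_{1,\phi}(\mathbf{x})).
\end{aligned} \tag{2.11}$$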


Similarly, we have

$$P_{FA}^{y} = \int_{\mathbb{R}^N} p_0(\mathbf{x})\, C_{n,\phi}(\mathbf{x})\, d\mathbf{x} = E_0[C_{n,\phi}(\mathbf{x})] \tag{2.12}$$

$$P_{FA}^{y} = \int_{\mathbb{R}^N} F_{0,\phi}(\mathbf{x})\, p_n(\mathbf{x})\, d\mathbf{x} = E_n(F_{0,\phi}(\mathbf{x})) \tag{2.13}$$

where

$$F_{i,\phi}(\mathbf{x}) \equiv \int_{\mathbb{R}^N} \phi(\mathbf{y})\, p_i(\mathbf{y} - \mathbf{x})\, d\mathbf{y}, \quad i = 0, 1, \tag{2.14}$$

and $F_{i,\phi}(\mathbf{x})$ corresponds to hypothesis $H_i$, and $E_i(\cdot)$, $E_n(\cdot)$ are the expected values based on the
distributions $p_i(\cdot)$ and $p_n(\cdot)$, respectively. Note that $F_{i,\phi}(\mathbf{x})$ has the property that $P_{FA}^{x} = F_{0,\phi}(\mathbf{0})$ and
$P_D^{x} = F_{1,\phi}(\mathbf{0})$. To simplify notation, we omit the subscript $\phi$ of $F$ and $C$ and denote them as $F_1$, $F_0$
and $C_n$, respectively. Further, from (2.14), $F_1(\mathbf{x}_0)$ and $F_0(\mathbf{x}_0)$ are actually the probability of
detection and probability of false alarm, respectively, for this detection scheme with input $\mathbf{y} = \mathbf{x} + \mathbf{x}_0$.
For example, $F_1(-2)$ is the $P_D$ of this detection scheme with input $x - 2$. Therefore, it is
very convenient to obtain the $F_1$ and $F_0$ values by analytic computation if $p_0$, $p_1$ and $\phi$ are
known. When they are not available, $F_1$ and $F_0$ can be
obtained from the data itself by processing it through the detector and recording the detection
performance. Thus, it is not necessary to have complete knowledge regarding $\phi(\cdot)$ and $p_i(\cdot)$.
From (2.11) and (2.13), we may formalize the optimal SR noise definition as follows.

Consider  the  two  hypothesis  detection  problem  as  in  (2.1).  The  PDF  of  the  optimum  SR 
noise  is  given  by 

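$$p_n^{\mathrm{opt}} = \arg\max_{p_n} \int_{\mathbb{R}^N} F_1(\mathbf{x})\, p_n(\mathbf{x})\, d\mathbf{x} \tag{2.15}$$

where

1) $p_n(\mathbf{x}) \ge 0$, $\mathbf{x} \in \mathbb{R}^N$

2) $\int_{\mathbb{R}^N} p_n(\mathbf{x})\, d\mathbf{x} = 1$

3) $\int_{\mathbb{R}^N} F_0(\mathbf{x})\, p_n(\mathbf{x})\, d\mathbf{x} \le F_0(\mathbf{0})$.

Conditions 1) and 2) are fundamental properties of a PDF. Condition 3) ensures that
$P_{FA}^{y} \le P_{FA}^{x}$, i.e., the $P_{FA}$ constraint under the Neyman-Pearson criterion is satisfied. Further, if the
inequality condition in 3) becomes equality, the constant false alarm rate (CFAR) property of the
original detector is maintained.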
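The quantities in (2.14) and the objective and constraint of (2.15) can be estimated directly from data, as the text notes, by shifting recorded samples and passing them through the fixed detector. The sketch below does this for a scalar sign detector. The Gaussian samples, the signal level 0.5, the candidate two-point noise, and the helper names `estimate_F` and `evaluate_discrete_noise` are hypothetical choices for illustration only.

```python
import random

def sign_detector(y):
    """Fixed critical function phi(y): decide H1 when y > 0."""
    return 1.0 if y > 0 else 0.0

def estimate_F(samples, shift, detector=sign_detector):
    """Empirical F_i(shift): fraction of shifted samples for which H1 is chosen."""
    return sum(detector(x + shift) for x in samples) / len(samples)

def evaluate_discrete_noise(points, weights, samples_h0, samples_h1):
    """P_D and P_FA when the added noise equals points[k] with probability weights[k]."""
    p_d = sum(w * estimate_F(samples_h1, n) for n, w in zip(points, weights))
    p_fa = sum(w * estimate_F(samples_h0, n) for n, w in zip(points, weights))
    return p_d, p_fa

rng = random.Random(0)  # fixed seed so the run is repeatable
samples_h0 = [rng.gauss(0.0, 1.0) for _ in range(20000)]  # assumed H0 data
samples_h1 = [rng.gauss(0.5, 1.0) for _ in range(20000)]  # assumed H1 data

# No added noise corresponds to a point mass at zero.
pd_x, pfa_x = evaluate_discrete_noise([0.0], [1.0], samples_h0, samples_h1)
# A candidate noise PDF with equal mass at -1 and +1 (hypothetical).
pd_y, pfa_y = evaluate_discrete_noise([-1.0, 1.0], [0.5, 0.5], samples_h0, samples_h1)
print(f"no noise:        P_D = {pd_x:.3f}, P_FA = {pfa_x:.3f}")
print(f"candidate noise: P_D = {pd_y:.3f}, P_FA = {pfa_y:.3f}")
```

For this unimodal Gaussian model the candidate noise keeps the estimated false alarm probability near 0.5 but lowers the detection probability, which illustrates the remark that an improvement is not always possible.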

2.3  Optimum  SR  Noise  for  Neyman-Pearson  Detection 

In general, it is difficult to find an exact form of $p_n(\cdot)$ directly because of condition 3).
However, an alternative approach considers the relationship between $p_n(\mathbf{x})$ and $F_1(\mathbf{x})$. From
(2.14), for a given value $f_0$ of $F_0$, we have $\mathbf{x} = F_0^{-1}(f_0)$, where $F_0^{-1}$ is the inverse of $F_0$. When $F_0$
is a one-to-one mapping function, $\mathbf{x}$ is a unique vector. Otherwise, $F_0^{-1}(f_0)$ is a set of $\mathbf{x}$ for
which $F_0(\mathbf{x}) = f_0$. Therefore, we can express a value or a set of values $f_1$ of $F_1$ as

$$f_1 = F_1(\mathbf{x}) = F_1(F_0^{-1}(f_0)). \tag{2.16}$$


Given the noise distribution $p_n(\cdot)$ in the original $\mathbb{R}^N$ domain, $p_{n,f_0}(\cdot)$, the noise distribution in
the $f_0$ domain, can also be uniquely determined. Further, the conditions on the optimum noise can
be rewritten in terms of $f_0$ equivalently as

4) $p_{n,f_0}(f_0) \ge 0$

5) $\int_0^1 p_{n,f_0}(f_0)\, df_0 = 1$

6) $\int_0^1 f_0\, p_{n,f_0}(f_0)\, df_0 \le P_{FA}^{x}$

and

$$P_D^{y} = \int_0^1 f_1\, p_{n,f_0}(f_0)\, df_0 \tag{2.17}$$

where $p_{n,f_0}(f_0)$ is the SR noise PDF in the $f_0$ domain. Compared to the original conditions 1), 2)
and 3), this equivalent form has some advantages. First, the problem complexity is dramatically
reduced. Rather than searching for an optimal solution in $\mathbb{R}^N$, an optimal solution is sought in a
single dimensional space. Second, by applying these new conditions, we avoid the direct use of
the underlying PDFs $p_1(\cdot)$ and $p_0(\cdot)$ and replace them with $f_1$ and $f_0$, respectively. Note that in
some cases, it is not easy to find the exact form of $f_1$ and $f_0$. However, recall that $F_1(\mathbf{x}_0)$ and
$F_0(\mathbf{x}_0)$ are the probability of detection and false alarm, respectively, of the original system with
input $\mathbf{x} + \mathbf{x}_0$. In practical applications, we may learn the relationship by Monte Carlo simulation using
importance sampling. In general, compared to $p_1$ and $p_0$, $f_1$ and $f_0$ are much easier to estimate and,
once the optimum $p_{n,f_0}$ is found, the optimum $p_n(\mathbf{x})$ is determined as well by virtue of the
inverse of the functions $F_0$ and $F_1$.

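Let us now consider the function $J(t)$ such that $J(t) = \sup(f_1 : f_0 = t)$ is the maximum value of
$f_1$ given $f_0$. Clearly, $J(P_{FA}^{x}) \ge F_1(0) = P_D^{x}$. From (2.17), it follows that for any noise $p_n$, we have

$$P_D^{y}(p_n) \le \int_0^1 J(f_0)\, p_{n,f_0}(f_0)\, df_0 = E_n(J). \tag{2.18}$$

Therefore, the optimum $P_D^{y}$ is attained when $f_1(f_0) = J(f_0)$ and $P_{D,\mathrm{opt}}^{y} = E_n(J)$.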

A.  Determination  of  SR  Detection  Improvement 

A significant contribution of the theoretical framework provided here is the capability to
determine if SR will indeed provide performance improvement in a given problem. Detection
enhancement using SR results if $P_{D,\mathrm{opt}}^{y} > P_D^{x}$. However, verifying this requires complete knowledge of
$F_0(\cdot)$ and $F_1(\cdot)$ as well as significant computation. For a large class of detectors, however,
depending on the specific properties of $J$, we may determine the sufficient conditions for
improvability and non-improvability more readily. These are given in the following theorems,
the proofs of which are contained in [Chen 2006b].

Theorem 1 (Improvability of Detection via SR): If $J(P_{FA}^{x}) > P_D^{x}$ or $J''(P_{FA}^{x}) > 0$ when $J(t)$ is
second order continuously differentiable around $P_{FA}^{x}$, then there exists at least one noise process
$\mathbf{n}$ with PDF $p_n(\cdot)$ that can improve the detection performance.

Theorem 2 (Non-improvability of Detection via SR): If there exists a non-decreasing concave
function $\Psi(f_0)$ where $\Psi(P_{FA}^{x}) = J(P_{FA}^{x}) = F_1(0)$ and $\Psi(f_0) \ge J(f_0)$ for every $f_0$, then $P_D^{y} \le P_D^{x}$ for any
independent noise, i.e., the detection performance cannot be improved by adding noise.
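The second-derivative condition of Theorem 1 can be checked numerically once $F_0$ and $F_1$ are available. The sketch below does so for the scalar sign detector in Gaussian mixture noise treated in Section 2.4, using the parameter values quoted there ($\mu = 3$, $A = 1$, $\sigma_0 = 1$). The closed forms for `F0` and `F1`, the bisection inverse, and the finite-difference step are assumptions of this sketch rather than part of the report; because `F0` is one-to-one in this example, $J(t)$ is taken as $F_1(F_0^{-1}(t))$, as in (2.16).

```python
import math

MU, A, SIGMA0 = 3.0, 1.0, 1.0  # values quoted in Section 2.4

def Phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def F0(x):
    """P(w + x > 0) for the symmetric two-component Gaussian mixture noise w."""
    return 0.5 * Phi((x - MU) / SIGMA0) + 0.5 * Phi((x + MU) / SIGMA0)

def F1(x):
    """P(A + w + x > 0): the same expression with the signal level added."""
    return F0(x + A)

def F0_inverse(t, lo=-20.0, hi=20.0):
    """Invert the increasing function F0 by bisection."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if F0(mid) < t:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def J(t):
    return F1(F0_inverse(t))

p_fa_x, p_d_x = F0(0.0), F1(0.0)
h = 1e-3  # finite-difference step in the f0 domain (assumed)
second_diff = (J(p_fa_x + h) - 2.0 * J(p_fa_x) + J(p_fa_x - h)) / h**2
improvable = J(p_fa_x) > p_d_x or second_diff > 0
print(f"P_FA = {p_fa_x:.3f}, P_D = {p_d_x:.3f}, J'' estimate = {second_diff:.1f}")
print("Theorem 1 condition satisfied:", improvable)
```

The positive second difference indicates that $J$ is locally convex at the operating point, so the sufficient condition of Theorem 1 holds for this example.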

B.  Determination  of  the  Optimum  SR  Noise  PDF 

Another very significant result of the theoretical framework is the determination of the exact
form of $p_n^{\mathrm{opt}}$ subject to the constraint $P_{FA}^{y} \le P_{FA}^{x}$. This is contained in Theorem 3 [Chen 2006b].

Theorem 3 (Form of Optimum SR Noise): To maximize $P_D^{y}$ under the constraint $P_{FA}^{y} \le P_{FA}^{x}$, the
optimum noise can be expressed as¹

$$p_n^{\mathrm{opt}}(\mathbf{n}) = \lambda\, \delta(\mathbf{n} - \mathbf{n}_1) + (1 - \lambda)\, \delta(\mathbf{n} - \mathbf{n}_2) \tag{2.19}$$

where $0 \le \lambda \le 1$. Specifically, to obtain the maximum achievable performance given the false
alarm constraint, the optimum noise is a randomization of two discrete vectors added with
probability $\lambda$ and $(1 - \lambda)$, respectively. It can also be shown [Chen 2006b] that

$$p_{n,f_0}^{\mathrm{opt}}(f_0) = \lambda\, \delta(f_0 - f_{01}) + (1 - \lambda)\, \delta(f_0 - f_{02}) \tag{2.20}$$

where $f_{01} = F_0(\mathbf{n}_1)$ and $f_{02} = F_0(\mathbf{n}_2)$. Alternatively, the optimum SR noise can also be expressed
in terms of $C_n$, such that

$$C_n^{\mathrm{opt}}(\mathbf{x}) = \lambda\, \phi(\mathbf{x} + \mathbf{n}_1) + (1 - \lambda)\, \phi(\mathbf{x} + \mathbf{n}_2). \tag{2.21}$$

From (2.20), we have

$$P_{D,\mathrm{opt}}^{y} = \lambda\, J(f_{01}) + (1 - \lambda)\, J(f_{02}) \tag{2.22}$$

and

$$P_{FA,\mathrm{opt}}^{y} = \lambda\, f_{01} + (1 - \lambda)\, f_{02} \le P_{FA}^{x}. \tag{2.23}$$

¹ This form of optimum noise PDF is not necessarily unique. There may exist other forms of noise PDF that achieve
the same detection performance.

C. Determination of the PDF of Optimum SR Noise

Depending upon the location of the maxima of $J(\cdot)$, we have the following theorem.

Theorem 4: Let $F_m = \max_t J(t)$ and $t_0 = \arg\min_t\, (J(t) = F_m)$, i.e., the smallest $t$ at which $J$ attains
its maximum. It follows that

Case 1: If $t_0 \le P_{FA}^{x}$, then $P_{FA,\mathrm{opt}}^{y} = t_0$ and $P_{D,\mathrm{opt}}^{y} = F_m$, i.e., the maximum achievable performance
is obtained when the optimum noise is a DC signal with value $\mathbf{n}_0$, or

$$p_n^{\mathrm{opt}}(\mathbf{n}) = \delta(\mathbf{n} - \mathbf{n}_0) \tag{2.24}$$

where $F_0(\mathbf{n}_0) = t_0$ and $F_1(\mathbf{n}_0) = F_m$. Here, the maximum probability of detection is achieved
subject to the probability of false alarm constraint $P_{FA}^{y} \le P_{FA}^{x}$ by adding a constant to the input
with a value that depends upon the decision regions and the probability density functions under
the two hypotheses. However, the threshold must be varied to maintain $P_{FA}^{y} \le P_{FA}^{x}$.

Case 2: If $t_0 > P_{FA}^{x}$, then $P_{FA,\mathrm{opt}}^{y} = F_0(0) = P_{FA}^{x}$, i.e., the inequality of (2.23) becomes equality.
Furthermore,

$$P_{FA,\mathrm{opt}}^{y} = \lambda\, f_{01} + (1 - \lambda)\, f_{02} = P_{FA}^{x}. \tag{2.25}$$

In this case, the probability of false alarm is maintained without a threshold variation. Thus, the
CFAR property is achieved.
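As a rough illustration of how the two cases of Theorem 4 might be distinguished in practice, the sketch below samples the $(f_0, f_1)$ curve on a grid of constant shifts and compares the smallest $f_0$ at which $f_1$ is (numerically) maximal with the original false alarm probability. It reuses the scalar Gaussian mixture sign detector of Section 2.4 with the parameter values quoted there; the grid limits, the tolerance, and the function name `theorem4_case` are assumptions of this sketch. On a finite grid the supremum of $J$ is only approximated.

```python
import math

MU, A, SIGMA0 = 3.0, 1.0, 1.0  # values quoted in Section 2.4

def Phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def F0(x):
    """False alarm probability of the sign detector after a constant shift x."""
    return 0.5 * Phi((x - MU) / SIGMA0) + 0.5 * Phi((x + MU) / SIGMA0)

def F1(x):
    """Detection probability of the sign detector after a constant shift x."""
    return F0(x + A)

def theorem4_case(pfa_x, lo=-10.0, hi=10.0, step=0.05, tol=1e-9):
    """Return (case, F_m, t0) estimated on a grid of shifts (assumed resolution)."""
    n_pts = int(round((hi - lo) / step)) + 1
    curve = [(F0(lo + k * step), F1(lo + k * step)) for k in range(n_pts)]
    f_m = max(f1 for _, f1 in curve)
    # Smallest f0 among grid points whose f1 is within tol of the maximum.
    t0 = min(f0 for f0, f1 in curve if f1 >= f_m - tol)
    case = 1 if t0 <= pfa_x else 2
    return case, f_m, t0

case, f_m, t0 = theorem4_case(pfa_x=F0(0.0))
print(f"F_m = {f_m:.4f}, t0 = {t0:.4f}, Theorem 4 case = {case}")
```

For this example the detection probability keeps rising toward one only as the false alarm probability also approaches one, so the grid estimate falls under Case 2, where the false alarm constraint is met with equality.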

2.4  A  Detection  Problem  Example 

The above theoretical methodology was applied to the problem considered by [Kay 2000].
Given observation data $x[n]$, $n = 1, 2, \ldots, N$, we consider the binary hypothesis testing problem
such that

$$\begin{cases}
H_0:\ x[n] = w[n], & n = 1, 2, \ldots, N \\
H_1:\ x[n] = A + w[n], & n = 1, 2, \ldots, N
\end{cases}$$

involving detection of a known DC signal level $A > 0$ in i.i.d. symmetric Gaussian mixture noise
$w[n]$ with PDF

$$p_w(w) = \tfrac{1}{2}\, \gamma(w; -\mu, \sigma_0^2) + \tfrac{1}{2}\, \gamma(w; \mu, \sigma_0^2) \tag{2.27}$$

where

$$\gamma(w; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[ -\frac{(w - \mu)^2}{2\sigma^2} \right]. \tag{2.28}$$

Here, $\mu = 3$, $A = 1$ and $\sigma_0 = 1$. The suboptimal sign detector is considered with test statistic


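$$T(\mathbf{x}) = \sum_{i=1}^{N} \left( \tfrac{1}{2} + \tfrac{1}{2}\, \mathrm{sgn}(x[i]) \right) = \sum_{i=1}^{N} \psi_{x[i]} \tag{2.29}$$

where $\psi_{x[i]} = \tfrac{1}{2} + \tfrac{1}{2}\, \mathrm{sgn}(x[i])$. From the second equality in (2.29), we consider this detector as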

essentially the fusion of the decision results from N i.i.d. sign detectors. For $N = 1$, the detection
problem reduces to a problem with test statistic $T_1(x) = x$, threshold $\eta = 0$ (sign detector), and
probability of false alarm $P_{FA}^{x} = 0.5$. The distribution of $x$ under the $H_0$ and $H_1$ hypotheses can be
expressed as

$$p_0(x) = \tfrac{1}{2}\, \gamma(x; -\mu, \sigma_0^2) + \tfrac{1}{2}\, \gamma(x; \mu, \sigma_0^2) \tag{2.30}$$

and

$$p_1(x) = \tfrac{1}{2}\, \gamma(x; -\mu + A, \sigma_0^2) + \tfrac{1}{2}\, \gamma(x; \mu + A, \sigma_0^2), \tag{2.40}$$

respectively. The critical function is given by

$$\phi(x) = \begin{cases}
1, & x > 0 \\
0, & x \le 0
\end{cases} \tag{2.41}$$

The problem of determining the optimal SR noise is to find the optimal $p_n(n)$ where, for the new
observation $y = x + n$, the probability of detection $P_D^{y} = p(y > 0; H_1)$ is maximum while the
probability of false alarm satisfies $P_{FA}^{y} = p(y > 0; H_0) \le P_{FA}^{x} = \tfrac{1}{2}$. It follows [Chen 2006b] that the
resulting optimal SR noise PDF is

$$p_n^{\mathrm{opt}}(n) = \lambda\, \delta(n - n_1) + (1 - \lambda)\, \delta(n - n_2) = 0.3085\, \delta(n + 3.5) + 0.6915\, \delta(n - 2.5). \tag{2.42}$$

It also follows that for the case of optimal SR noise, but now constrained to have a symmetric
PDF, $p_s^{\mathrm{opt}}(x)$, where $p_s^{\mathrm{opt}}(x) = p_s^{\mathrm{opt}}(-x)$, we have for the example considered here

$$p_s^{\mathrm{opt}}(x) = \tfrac{1}{2}\, \delta(x - \mu) + \tfrac{1}{2}\, \delta(x + \mu). \tag{2.43}$$
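The noise PDFs in (2.42) and (2.43) can be checked numerically for the single-sample sign detector. The sketch below evaluates the detection and false alarm probabilities as weighted sums of $F_1$ and $F_0$ at the point masses, following (2.11) and (2.13). The closed forms used for `F0` and `F1` follow from the Gaussian mixture model with $\mu = 3$, $A = 1$, $\sigma_0 = 1$; the function names are illustrative, and (2.43) is read here as equal-weight point masses at $\pm\mu$, which is an interpretation of the text.

```python
import math

MU, A, SIGMA0 = 3.0, 1.0, 1.0  # values quoted in the text

def Phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def F0(x):
    """P(w + x > 0) under H0 for the two-component Gaussian mixture noise."""
    return 0.5 * Phi((x - MU) / SIGMA0) + 0.5 * Phi((x + MU) / SIGMA0)

def F1(x):
    """P(A + w + x > 0) under H1."""
    return F0(x + A)

def performance(points, weights):
    """(P_D, P_FA) for noise equal to points[k] with probability weights[k]."""
    p_d = sum(w * F1(n) for n, w in zip(points, weights))
    p_fa = sum(w * F0(n) for n, w in zip(points, weights))
    return p_d, p_fa

pd_x, pfa_x = performance([0.0], [1.0])                       # no SR noise
pd_opt, pfa_opt = performance([-3.5, 2.5], [0.3085, 0.6915])  # (2.42)
pd_sym, pfa_sym = performance([-MU, MU], [0.5, 0.5])          # (2.43)

for name, pd, pfa in [("none", pd_x, pfa_x),
                      ("optimal (2.42)", pd_opt, pfa_opt),
                      ("symmetric (2.43)", pd_sym, pfa_sym)]:
    print(f"{name:18s} P_D = {pd:.4f}  P_FA = {pfa:.4f}")
```

Under these assumptions the detection probability rises from about 0.51 with no added noise to about 0.67 with the symmetric noise and about 0.69 with the noise of (2.42), while the false alarm probability stays at 0.5 up to the rounding of the weights.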

Performance results are shown in Fig. 2.1. First, let us consider Fig. 2.1a which plots the
curve $U = (f_1, f_0)$ where $f_1 = F_1(x)$ and $f_0 = F_0(x)$. This curve is significant in that it reveals the
potential for detection performance improvement via SR. This is observed by again noting that
for the original data process $x$, $P_{FA}^{x} = F_0(0)$ and $P_D^{x} = F_1(0)$. For the problem considered here,
$P_{FA}^{x} = F_0(0) = f_0 = 0.5$ and $P_D^{x} = F_1(0) = f_1 = 0.51$, yielding a detection probability barely above
$P_{FA}$. However, as indicated by the curve that upper bounds the convex hull $V$, which contains
all possible $P_D$ and $P_{FA}$ values after SR noise is added, $P_D^{y} = 0.6915$. Thus, the region for potential
performance improvement via SR is the region of the convex hull $V$ that lies above the curve
$U = (f_1, f_0)$. Further, this indicates that if the curve $U$ is a convex function, there is no potential for
performance improvement via SR. Thus, the curve $U = (f_1, f_0)$ provides an important diagnostic
tool for the determination of potential performance enhancement.
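The upper boundary of the convex hull at the operating point can be approximated by a brute-force search over pairs of points on the $(f_0, f_1)$ curve, which is one way to recover a two-point noise of the form given in Theorem 3. The sketch below does this for the example of this section. The grid of candidate shifts, its resolution, and the function name `best_two_point_noise` are assumptions of this sketch; it is not the procedure used in [Chen 2006b].

```python
import math

MU, A, SIGMA0 = 3.0, 1.0, 1.0  # values quoted in the text

def Phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def F0(x):
    """False alarm probability of the sign detector after a constant shift x."""
    return 0.5 * Phi((x - MU) / SIGMA0) + 0.5 * Phi((x + MU) / SIGMA0)

def F1(x):
    """Detection probability of the sign detector after a constant shift x."""
    return F0(x + A)

def best_two_point_noise(pfa_max, lo=-10.0, hi=10.0, step=0.05):
    """Grid search for (P_D, lambda, n1, n2) maximizing P_D with P_FA <= pfa_max."""
    n_pts = int(round((hi - lo) / step)) + 1
    shifts = [lo + k * step for k in range(n_pts)]
    curve = [(F0(x), F1(x), x) for x in shifts]
    below = [p for p in curve if p[0] <= pfa_max]
    above = [p for p in curve if p[0] > pfa_max]
    # Start from the best single constant shift that meets the constraint.
    _, f1_best, x_best = max(below, key=lambda p: p[1])
    best = (f1_best, 1.0, x_best, x_best)
    for f0a, f1a, xa in below:
        for f0b, f1b, xb in above:
            lam = (f0b - pfa_max) / (f0b - f0a)  # weight that gives P_FA = pfa_max
            pd = lam * f1a + (1.0 - lam) * f1b
            if pd > best[0]:
                best = (pd, lam, xa, xb)
    return best

pd_opt, lam, n1, n2 = best_two_point_noise(pfa_max=0.5)
print(f"P_D = {pd_opt:.4f} with lambda = {lam:.4f}, n1 = {n1:.2f}, n2 = {n2:.2f}")
```

With these settings the search should land close to the weights and locations reported in (2.42), giving a detection probability near 0.6915 at a false alarm probability of 0.5.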

Fig. 2.1b plots $P_D$ versus the signal level $A$. The lowest curve plots $P_D^{x}$ with no SR noise.
As expected, it increases with increasing signal level. The next three curves, which lie just
above, utilize the optimum PDFs for three symmetric SR noise cases. Ranging from lowest to
highest, the SR noise consisted of white Gaussian, uniform, and the optimal symmetric SR noise
of (2.43), respectively. We observe that for the symmetric SR noise cases, the curves all
converge at $A = \mu$ to a common $P_D$ value which lies on the $P_D^{x}$ curve. Thus, for values of $A > \mu$,
no performance improvement is obtained using symmetric SR noise. However, for small values
of $A$, say $0 < A < 0.6$, the $P_D^{y}$ values of the symmetric optimal SR noise case achieve the same
level of performance as those obtained for the case of optimal SR noise with PDF given by
(2.42). For $A > 0.6$, however, the latter optimal noise case achieves superior performance levels
closer to the optimal likelihood ratio test (LRT).

Fig. 2.1c considers the dependence of detection performance upon the background noise
standard deviation, $\sigma_0$. The lowest curve plots $P_D^x$ versus $\sigma_0$; i.e., the case for which no SR
noise is added. We observe that as $\sigma_0$ increases, $P_D^x$ increases until the noise standard deviation
reaches the level $\sigma_1 = 2.942$. For low values of $\sigma_0$ (high SNR), however, the optimum SR noise
enhanced detector reaches $P_D^y \approx 1$, while for the symmetric SR noise enhanced detectors, the
performance is reduced. As $\sigma_0$ increases, the performance of the SR enhanced detectors
converges to the $P_D^x$ value at $\sigma_0 = \sigma_1$. This results from the convergence of the bimodal
Gaussian background noise PDF $p_0(x)$ to a unimodal PDF as $\sigma_0$ approaches $\sigma_1$. At this point,
the decision function $\phi(x)$ and the LRT are equivalent for $P_{FA} = 0.5$. Thus, adding SR noise
will not improve $P_D$ and all the detection results converge to $P_D^x$.

Fig. 2.1d shows each detector's performance with respect to $\mu$ for A = 1 and $\sigma_0 = 1$. All the
detectors show a decrease in performance as $\mu$ increases from zero. However, for $\mu > \mu_0 \approx 1.5$,
the performance of the optimum LRT begins to increase with increasing $\mu$. To explain this effect, we note that for
$\mu \ll \mu_0$, the bimodal Gaussian noise is approximately unimodal. However, as these two Gaussian
mixture peaks separate, the detectability initially decreases until the peaks are sufficiently
separated. As $\mu \to \infty$, the peak separation is sufficient such that the background noise PDF is
essentially unimodal for signal level A > 0.

Finally, Fig. 2.2 shows the ROC curve for this problem, but now with N = 30. Again, the
superior performance of the optimal SR noise enhanced detectors is observed compared with the
cases of uniform and Gaussian SR noise. Specifically, the detection performance is much closer
to the optimal LRT curve.

Fig. 2.1 a.) Plot of $U = (f_1, f_0)$ where $f_1 = F_1(x)$ and $f_0 = F_0(x)$, b.) $P_D$ versus signal level A, c.) $P_D$
versus $\sigma_0$, d.) $P_D$ versus $\mu$.


3.0  An  Alternative  Derivation  for  the  SR  Detection  Framework 


The following derivation [Kay, 2005] addresses the problem of deciding between two
hypotheses based on a single sample. The sample may be a single data sample or a test statistic.
The scalar test statistic is x, which has PDF $p_0(x)$ under $H_0$ and $p_1(x)$ under $H_1$. It is assumed
that we decide $H_1$ if x > 0. Now consider the same detector but replace x by y = x + c, where c is
a random variable that is independent of x. Then, we decide $H_1$ if y = x + c > 0. The PDF of c is
$p_c(c)$ and it is this PDF that is of interest. We allow the use of impulses in the PDF so that c may
be a continuous, discrete, or mixed random variable. For this detector, the probability of false alarm
and probability of detection are


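$$P_{FA} = \int_0^{\infty} p_0^y(y)\,dy \qquad (3.1)$$

$$P_D = \int_0^{\infty} p_1^y(y)\,dy \qquad (3.2)$$

where the PDFs of y are

$$p_0^y(y) = \int_{-\infty}^{\infty} p_0(y - c)\,p_c(c)\,dc$$

$$p_1^y(y) = \int_{-\infty}^{\infty} p_1(y - c)\,p_c(c)\,dc$$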

which follows from the usual result for the sum of two independent random variables. Explicitly
then

$$\begin{aligned}
P_{FA} &= \int_0^{\infty} \int_{-\infty}^{\infty} p_0(y - c)\,p_c(c)\,dc\,dy \\
&= \int_{-\infty}^{\infty} \left[\int_0^{\infty} p_0(y - c)\,dy\right] p_c(c)\,dc \\
&= \int_{-\infty}^{\infty} P_{FA}(c)\,p_c(c)\,dc
\end{aligned}$$

where $P_{FA}(c) = \int_0^{\infty} p_0(y - c)\,dy$ denotes the conditional probability of false alarm. It is the
probability of false alarm conditional on observing C = c. Note that the unconditional
probability of false alarm $P_{FA}$ is just $E_c[P_{FA}(C)]$, where $E_c$ denotes expectation with respect to
the PDF $p_c(c)$. Similarly, we have

$$P_D = \int_{-\infty}^{\infty} P_D(c)\,p_c(c)\,dc = E_c[P_D(C)]$$

where

$$P_D(c) = \int_0^{\infty} p_1(y - c)\,dy.$$

It is interesting to note that $P_D(c)$ is just the probability of detection for the original detector
but with the threshold, originally given by zero, replaced by -c. This is because we decide $H_1$ if
y = x + c > 0 or equivalently if x > -c. The threshold, however, is a random variable and hence
the overall detector performance is given by the expected value of $P_D(c)$.
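
The equivalence between adding c and randomizing the threshold can be illustrated by simulation. The sketch below is only an illustration under assumed parameters: it reuses the Gaussian mixture example (means ±3, unit variance, A = 1) and the two-point PDF of (2.42) for C; the helper names and the seed are arbitrary choices.

```python
import random

def sample_mixture(rng, shift=0.0, mu=3.0, sigma=1.0):
    """Draw one sample from 0.5*N(mu, sigma^2) + 0.5*N(-mu, sigma^2), plus shift."""
    centre = mu if rng.random() < 0.5 else -mu
    return rng.gauss(centre, sigma) + shift

def sample_c(rng, points=(-3.5, 2.5), weights=(0.3085, 0.6915)):
    """Draw the added variable C from a two-point (impulsive) PDF."""
    return points[0] if rng.random() < weights[0] else points[1]

def estimate(rng, trials, signal):
    """Estimate P(x + c > 0) and P(x > -c) from the same random draws."""
    hits_sum = hits_thr = 0
    for _ in range(trials):
        x = sample_mixture(rng, shift=signal)
        c = sample_c(rng)
        hits_sum += (x + c > 0.0)   # detector applied to y = x + c
        hits_thr += (x > -c)        # original detector with random threshold -c
    return hits_sum / trials, hits_thr / trials

rng = random.Random(12345)
TRIALS = 200_000
pfa_sum, pfa_thr = estimate(rng, TRIALS, signal=0.0)  # under H0
pd_sum, pd_thr = estimate(rng, TRIALS, signal=1.0)    # under H1 (assumed A = 1)
print(f"PFA ~ {pfa_sum:.3f} (threshold form {pfa_thr:.3f})")
print(f"PD  ~ {pd_sum:.3f} (threshold form {pd_thr:.3f})")
```

Both forms count the same events, so the two estimates in each line should agree, and each is a sample average of the conditional probabilities over the draws of C.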

Our problem has now reduced to the following. Choose $p_c(c)$ so that $E_c[P_D(C)]$ is
maximized subject to the constraint that $E_c[P_{FA}(C)] = 1/2$. (We assume continuity of the original
ROC so that the false alarm constraint is an equality.) To proceed further it is useful to simplify
$P_{FA}(c)$ and $P_D(c)$. Consider

$$P_{FA}(c) = \int_0^{\infty} p_0(y - c)\,dy = \int_{-c}^{\infty} p_0(t)\,dt \quad (\text{let } t = y - c).$$

Now note that in this form it is clear that $P_{FA}(c)$ is just the complementary distribution
function or right-tail probability evaluated at -c. As such, as c increases (-c decreases),
$P_{FA}(c)$ also increases. Thus, $P_{FA}(c)$ is monotonically increasing with c. It is well known that
if a function is strictly monotonically increasing, then the inverse function exists and it too is
monotonically increasing; for example, g(x) = exp(x). Thus, we will use a variable change by
letting $u = P_{FA}(c)$, where if $-\infty < c < \infty$, we must have 0 < u < 1. The inverse function is
denoted as $c = P_{FA}^{-1}(u)$. Hence, we now have the new random variable $U = P_{FA}(C)$ and therefore

$$P_{FA} = E_c[P_{FA}(C)] = E_U[U] = \int_0^1 u\,p_U(u)\,du$$

and similarly

$$P_D = E_c[P_D(C)] = E_U[P_D(P_{FA}^{-1}(U))] = \int_0^1 P_D(P_{FA}^{-1}(u))\,p_U(u)\,du.$$


Recall that we desire $P_{FA} = 1/2$. Thus, the equivalent optimization problem is to maximize

$$J(p_U) = \int_0^1 P_D(P_{FA}^{-1}(u))\,p_U(u)\,du \qquad (3.3)$$

subject to the constraint that

$$\int_0^1 u\,p_U(u)\,du = 1/2.$$

However, it is more convenient to write the constraint as

$$\int_0^1 (u - 1/2)\,p_U(u)\,du = 0 = E_U[U - 1/2]$$

and to define a new random variable W = U - 1/2 so that the constraint becomes $E_W[W] = 0$.
Then, from (3.3), $J(p_U)$ becomes

$$J(p_U) = E_U[P_D(P_{FA}^{-1}(U))] = E_W[P_D(P_{FA}^{-1}(W + 1/2))] = J(p_W)$$

where explicitly

$$J(p_W) = \int_{-1/2}^{1/2} P_D(P_{FA}^{-1}(w + 1/2))\,p_W(w)\,dw = \int_{-1/2}^{1/2} g(w)\,p_W(w)\,dw$$

and

$$g(w) = P_D(P_{FA}^{-1}(w + 1/2)).$$

Summarizing, we wish to maximize over $p_W(w)$ the functional

$$\int_{-1/2}^{1/2} g(w)\,p_W(w)\,dw$$

subject to the constraint that $E_W[W] = 0$. Note that the random variable $W = P_{FA}(C) - 1/2$ takes on
values in the interval [-1/2, 1/2]. We can further simplify the problem by maximizing the
functional

$$\int_{-1/2}^{1/2} \left(g(w) - (w + 1/2)\right) p_W(w)\,dw$$

since $\int_{-1/2}^{1/2} (w + 1/2)\,p_W(w)\,dw = 1/2$ due to the $E_W[W] = 0$ constraint. Letting

$$h(w) = g(w) - (w + 1/2)$$

which is explicitly

$$h(w) = P_D(P_{FA}^{-1}(w + 1/2)) - (w + 1/2) \qquad (3.4)$$

we finally seek to maximize

$$\int_{-1/2}^{1/2} h(w)\,p_W(w)\,dw$$

subject to the constraint that $E_W[W] = 0$. Note that the function h(w) takes on nonnegative
values if $P_D(c) \ge P_{FA}(c)$ (the ROC for a variable threshold of the detector that decides $H_1$ if x > -c
is above the 45° line). This is because, with $u = w + 1/2$ and $c = P_{FA}^{-1}(u)$,

$$h(w) = P_D(P_{FA}^{-1}(w + 1/2)) - (w + 1/2) = P_D(P_{FA}^{-1}(u)) - u = P_D(c) - P_{FA}(c) \ge 0.$$

Also, at the end points of the [-1/2, 1/2] interval in w we have

$$h(-1/2) = P_D(P_{FA}^{-1}(0)) - 0 = P_D(-\infty) = 0$$
$$h(1/2) = P_D(P_{FA}^{-1}(1)) - 1 = P_D(+\infty) - 1 = 0.$$

A typical plot of h(w) is shown in Fig. 3.1. This example will be used later.
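
A curve of this kind can be tabulated numerically. The following sketch is one possible way to do it, assuming the Gaussian mixture example (means ±3, unit variance, A = 1) and inverting $P_{FA}(c)$ by bisection; the bisection limits, iteration count and grid spacing are arbitrary illustrative choices.

```python
import math

def q(x):
    """Right-tail probability of a standard normal random variable."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def p_fa(c, mu=3.0):
    """Conditional false alarm probability P(x > -c | H0)."""
    return 0.5 * q(-c - mu) + 0.5 * q(-c + mu)

def p_d(c, mu=3.0, a=1.0):
    """Conditional detection probability P(x > -c | H1)."""
    return 0.5 * q(-c - mu - a) + 0.5 * q(-c + mu - a)

def p_fa_inverse(u, lo=-40.0, hi=40.0, iters=200):
    """Invert the increasing function P_FA(c) by bisection, for 0 < u < 1."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if p_fa(mid) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def h(w):
    """h(w) = P_D(P_FA^{-1}(w + 1/2)) - (w + 1/2), for -1/2 < w < 1/2."""
    u = w + 0.5
    return p_d(p_fa_inverse(u)) - u

grid = [-0.5 + k / 200.0 for k in range(1, 200)]  # interior points only
values = [h(w) for w in grid]
print(f"min h = {min(values):.4f}, max h = {max(values):.4f}")
print(f"h near the end points: {values[0]:.4f}, {values[-1]:.4f}")
```

For this example h(w) should remain positive on the interior grid, fall toward zero near both end points, and peak at a value of about 0.19.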


Fig. 3.1 Plot of the h(w) function for the example problem.


In essence, we have modified the function $F_1$ versus $F_0$ in Section 2.0 so that it is plotted as
the difference between the function and the 45° line. (See also Fig. 2.1a for $F_1$ versus $F_0$.) Also,
it has been shifted to be defined over the symmetric interval [-1/2, 1/2]. This has the advantage
of simplifying extensions of the results given here. In addition, the maxima, which will be
required later, are easily found numerically using efficient routines such as a golden-section search.

The problem now is to maximize the functional

$$J(p_W) = \int_{-1/2}^{1/2} h(w)\,p_W(w)\,dw \qquad (3.5)$$

subject to the constraint that $E_W[W] = 0$. The function h(w) is nonnegative over the interval [-1/2,
1/2]. To simplify the discussion we will assume that h(w) has a unique maximum over the
interval (0, 1/2), and a unique maximum over the interval [-1/2, 0]. The maximum is assumed not
to occur at w = 0. Also, it is assumed that the maximum values are equal. These assumptions
are satisfied for the example given in [Kay, 2000]. (More rigorous and general results can be
obtained using standard theorems in analysis such as ‘continuous functions on compact sets’,
etc.) Once the optimal $p_W$ is found, the PDF for C, $p_c$, can be found by transforming back to C
using the relationship $W = P_{FA}(C) - 1/2$ and the standard results in transformation of random
variables. Note that if the maximum of h(w) over the interval [-1/2, 1/2] were to occur at w = 0,
then the constraint $E_W[W] = 0$ would be satisfied for $p_W(w) = \delta(w)$ and $J(p_W)$ would also be
maximized. Thus, the solution would be to choose $c = P_{FA}^{-1}(w + 1/2) = P_{FA}^{-1}(1/2) = 0$ and no
improvement in performance would be possible.

Continuing, we let $w_-$ be the value that maximizes h(w) for w < 0 and $w_+$ be the value that
maximizes h(w) for w > 0 (w = 0 is excluded), and assume that $h(w_-) = h(w_+)$. Let the set A
denote the remaining portion of the interval [-1/2, 1/2] so that $\{w_-, w_+\} \cup A = [-1/2, 1/2]$ and
$\{w_-, w_+\} \cap A = \emptyset$. Next, represent $p_W(w)$ as

$$p_W(w) = \alpha(w_-)\,\delta(w - w_-) + \alpha(w_+)\,\delta(w - w_+) + p_W(w)\,I_A(w) \qquad (3.6)$$

where $\delta$ denotes the Dirac delta function and $I_A(w) = 1$ for $w \in A$ and is zero otherwise (the
indicator function). Actually, any PDF may be decomposed this way subject to the condition
that

$$\alpha(w_-) + \alpha(w_+) + \int_A p_W(w)\,dw = 1.$$

Using this in (3.5) produces

$$J(p_W) = h(w_-)\,\alpha(w_-) + h(w_+)\,\alpha(w_+) + \int_A h(w)\,p_W(w)\,dw. \qquad (3.7)$$

In  order  to  satisfy  the  constraint,  we  must  have  that  pw(w)  has  mass  for  w  >  0  and  w  <  0  (or 
the  probability  of  W  being  negative  is  nonzero  and  the  probability  of  W  being  positive  is  also 
nonzero).  Otherwise,  we  could  not  have  Ew[W]  =  0.  The  constraints  on  pw(w)  form  the  two 
linear  equations 

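$$\alpha(w_-) + \alpha(w_+) + \int_A p_W(w)\,dw = 1$$
$$\alpha(w_-)\,w_- + \alpha(w_+)\,w_+ + \int_A w\,p_W(w)\,dw = 0.$$

In matrix form the constraints are

$$\mathbf{A}\boldsymbol{\alpha} = \mathbf{b}$$

or

$$\begin{bmatrix} 1 & 1 \\ w_- & w_+ \end{bmatrix}
\begin{bmatrix} \alpha(w_-) \\ \alpha(w_+) \end{bmatrix} =
\begin{bmatrix} 1 - \int_A p_W(w)\,dw \\ -\int_A w\,p_W(w)\,dw \end{bmatrix} \qquad (3.8)$$

where $\mathbf{A}$, $\boldsymbol{\alpha}$ and $\mathbf{b}$ are defined implicitly. With $\mathbf{h} = [h(w_-)\ \ h(w_+)]^T$, we have from (3.7)

$$\begin{aligned}
J(p_W) &= \mathbf{h}^T\boldsymbol{\alpha} + \int_A h(w)\,p_W(w)\,dw \\
&= \mathbf{h}^T\mathbf{A}^{-1}\mathbf{b} + \int_A h(w)\,p_W(w)\,dw \\
&= [h(w_-)\ \ h(w_+)]\,\frac{1}{w_+ - w_-}
\begin{bmatrix} w_+ & -1 \\ -w_- & 1 \end{bmatrix}
\begin{bmatrix} 1 - \int_A p_W(w)\,dw \\ -\int_A w\,p_W(w)\,dw \end{bmatrix}
+ \int_A h(w)\,p_W(w)\,dw \\
&= \left[\frac{w_+ h(w_-) - w_- h(w_+)}{w_+ - w_-}\ \ \ \frac{h(w_+) - h(w_-)}{w_+ - w_-}\right]
\begin{bmatrix} 1 - \int_A p_W(w)\,dw \\ -\int_A w\,p_W(w)\,dw \end{bmatrix}
+ \int_A h(w)\,p_W(w)\,dw.
\end{aligned}$$

We point out that the two terms in the first bracket can be expressed as

$$\frac{w_+ h(w_-) - w_- h(w_+)}{w_+ - w_-} = h(w_+)$$

and

$$\frac{h(w_+) - h(w_-)}{w_+ - w_-} = 0.$$

Recalling that $h(w_-) = h(w_+)$ and recognizing that $h(w) < h(w_+)$ for all $w \in A$, we have that

$$J(p_W) = h(w_+) + \int_A \left(h(w) - h(w_+)\right) p_W(w)\,dw \le h(w_+).$$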

Clearly, the upper bound is attained when $p_W(w) = 0$ for all $w \in A$. This results in the solution
from (3.6) of

$$p_W(w) = \frac{w_+}{w_+ - w_-}\,\delta(w - w_-) + \frac{w_-}{w_- - w_+}\,\delta(w - w_+) \qquad (3.9)$$

where we have solved for $\alpha(w_-)$ and $\alpha(w_+)$ by using (3.8) with the right-hand-side vector being
$[1\ \ 0]^T$. Since W can only take on values $w_-$ and $w_+$, it follows that the only values C can take on are

$$c_- = P_{FA}^{-1}(w_- + 1/2)$$
$$c_+ = P_{FA}^{-1}(w_+ + 1/2). \qquad (3.10)$$

The optimal PDF for C is therefore

$$p_c(c) = \frac{w_+}{w_+ - w_-}\,\delta(c - c_-) + \frac{w_-}{w_- - w_+}\,\delta(c - c_+) \qquad (3.11)$$

where $c_-$ and $c_+$ are given by (3.10) and $w_-$ and $w_+$ are the values that maximize
$P_D(P_{FA}^{-1}(w + 1/2)) - (w + 1/2)$ for $-1/2 \le w < 0$ and $0 < w \le 1/2$, respectively.

We again consider the example [Kay 2000] of Section 2.0 and assume only one sample as
described above. The PDFs are given by

$$p_0(x) = \frac{1}{2\sqrt{2\pi}}\exp\left(-\frac{1}{2}(x - 3)^2\right) + \frac{1}{2\sqrt{2\pi}}\exp\left(-\frac{1}{2}(x + 3)^2\right)$$

$$p_1(x) = p_0(x - 1).$$

Since for a fixed c we decide $H_1$ if x > -c, we have

$$P_{FA}(c) = \frac{1}{2}Q(-c - 3) + \frac{1}{2}Q(-c + 3)$$

$$P_D(c) = \frac{1}{2}Q(-c - 4) + \frac{1}{2}Q(-c + 2)$$

where Q(x) is the right-tail probability for a standard normal random variable. This is plotted in
Fig. 3.2. The function h(w) is given by (3.4) as

$$h(w) = P_D(P_{FA}^{-1}(w + 1/2)) - (w + 1/2)$$

and can be evaluated over [-1/2, 1/2] as in Fig. 3.1. A numerical search finds the maxima of
h(w), which when converted to c yields $c_- = -3.507$ and $c_+ = 2.493$. The other values become

$$\frac{w_+}{w_+ - w_-} = 0.306 \quad \text{and} \quad \frac{w_-}{w_- - w_+} = 0.694$$

so that the optimal PDF is

$$p_c(c) = 0.306\,\delta(c + 3.507) + 0.694\,\delta(c - 2.493)$$

in agreement with the results obtained in Section 2.0.
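
The numerical search described above can be sketched as follows. This is not the routine used by the authors; it is a minimal illustration that searches directly over c (valid because $w = P_{FA}(c) - 1/2$ is increasing in c, so $h(w) = P_D(c) - P_{FA}(c)$ at the corresponding point), using a coarse scan followed by a golden-section refinement. The search ranges and tolerances are assumptions.

```python
import math

def q(x):
    """Right-tail probability of a standard normal random variable."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def p_fa(c):
    return 0.5 * q(-c - 3.0) + 0.5 * q(-c + 3.0)

def p_d(c):
    return 0.5 * q(-c - 4.0) + 0.5 * q(-c + 2.0)

def h_of_c(c):
    """h(w) evaluated at w = P_FA(c) - 1/2, i.e. P_D(c) - P_FA(c)."""
    return p_d(c) - p_fa(c)

def golden_max(f, lo, hi, tol=1e-9):
    """Golden-section search for the maximiser of a unimodal f on [lo, hi]."""
    r = (math.sqrt(5.0) - 1.0) / 2.0
    x1, x2 = hi - r * (hi - lo), lo + r * (hi - lo)
    f1, f2 = f(x1), f(x2)
    while hi - lo > tol:
        if f1 < f2:                      # maximum lies in [x1, hi]
            lo, x1, f1 = x1, x2, f2
            x2 = lo + r * (hi - lo)
            f2 = f(x2)
        else:                            # maximum lies in [lo, x2]
            hi, x2, f2 = x2, x1, f1
            x1 = hi - r * (hi - lo)
            f1 = f(x1)
    return 0.5 * (lo + hi)

def bracketed_max(f, lo, hi, steps=400):
    """Coarse scan to bracket the largest value, then golden-section refinement."""
    xs = [lo + (hi - lo) * k / steps for k in range(steps + 1)]
    k = max(range(steps + 1), key=lambda i: f(xs[i]))
    return golden_max(f, xs[max(k - 1, 0)], xs[min(k + 1, steps)])

c_minus = bracketed_max(h_of_c, -8.0, 0.0)  # maximiser with w < 0 (c < 0)
c_plus = bracketed_max(h_of_c, 0.0, 8.0)    # maximiser with w > 0 (c > 0)
w_minus, w_plus = p_fa(c_minus) - 0.5, p_fa(c_plus) - 0.5

alpha_minus = w_plus / (w_plus - w_minus)   # mass at c_minus, from (3.9)
alpha_plus = w_minus / (w_minus - w_plus)   # mass at c_plus

print(f"c- = {c_minus:.3f}, c+ = {c_plus:.3f}")
print(f"weights = {alpha_minus:.3f}, {alpha_plus:.3f}")
```

The locations and weights found this way should land close to the values quoted above (near c = -3.5 and c = 2.5, with masses near 0.31 and 0.69); small differences in the third decimal place can arise from the search tolerance.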
Fig. 3.2 Plot of $P_D(c)$ versus $P_{FA}(c)$.


4.0 Reducing Probability of Decision Error using Stochastic Resonance

In  this  chapter,  we  address  the  problem  of  reducing  the  probability  of  decision  error  of  an 
existing  binary  receiver  that  is  suboptimal  using  the  ideas  of  stochastic  resonance.  The  optimal 
probability  density  function  of  the  random  variable  that  should  be  added  to  the  input  is  found  to 
be  a  Dirac  delta  function,  and  hence  the  optimal  random  variable  is  a  constant.  The  constant  to 
be  added  depends  upon  the  decision  regions  and  the  probability  density  functions  under  the  two 
hypotheses,  and  is  illustrated  with  an  example.  Also,  an  approximate  procedure  for  the  constant 
determination  is  derived  for  the  mean-shifted  binary  hypothesis  testing  problem. 

Here, we consider the problem of deciding between two hypotheses $H_0$ and $H_1$ that can
occur with a priori probabilities $P[H_0] = \pi_0$ and $P[H_1] = \pi_1 = 1 - \pi_0$, respectively. Our criterion
for performance will be the probability of error $P_e$, although the derivation is easily modified to
minimize the Bayes' risk by assigning costs associated with each decision [Kay 1998]. It is
assumed that the decision regions have already been specified, that they are not optimal in terms
of minimizing $P_e$, and that a single data sample x is used to make a decision. The already
specified decision regions may be arbitrary and hence our solution encompasses, as examples,
rules that decide $H_1$ if x > a or if |x| < a. The single sample is usually a test
statistic, i.e., a function of a set of observations, which is a common procedure in decision
making. To improve the performance, a “noisy sample” c is added to form y = x + c prior to
decision making. We allow c to be a random variable and determine the PDF of c that will yield
the minimum $P_e$. It is proven next that the optimal PDF is a Dirac delta function, which leads to
the conclusion that the optimal random variable to be added is a degenerate one, i.e., a constant.

4.1  Optimal  PDF  of  Additive  Noise  Sample 

To  write  the  probability  of  error  for  the  original  problem,  we  define  the  decision  rule  (also 
called  the  test  function  or  critical  region  indicator  function)  as 

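$$\phi(x) = \begin{cases} 0 & \text{decide } H_0 \\ 1 & \text{decide } H_1. \end{cases}$$

Then, we have

$$\begin{aligned}
P_e &= P[\text{decide } H_1 \mid H_0]\,P[H_0] + P[\text{decide } H_0 \mid H_1]\,P[H_1] \\
&= P[\phi(x) = 1 \mid H_0]\,\pi_0 + P[\phi(x) = 0 \mid H_1]\,\pi_1 \\
&= \pi_0 \int_{-\infty}^{\infty} \phi(x)\,p_0^x(x)\,dx + \pi_1 \int_{-\infty}^{\infty} (1 - \phi(x))\,p_1^x(x)\,dx
\end{aligned}$$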

where $p_0^x(x)$ and $p_1^x(x)$ are the probability density functions (PDFs) under $H_0$ and $H_1$,
respectively. This can be written as

$$P_e = \pi_1 + \int_{-\infty}^{\infty} \phi(x)\left[\pi_0 p_0^x(x) - \pi_1 p_1^x(x)\right] dx.$$

Now assume that we modify x by adding c so that the test statistic becomes y = x + c, where c is a
random variable independent of x, and whose PDF is $p_c(c)$. Since the identical decision rule is to
be used, we have

$$P_e = \pi_1 + \int_{-\infty}^{\infty} \phi(y)\left[\pi_0 p_0^y(y) - \pi_1 p_1^y(y)\right] dy.$$

But

$$p_0^y(y) = \int_{-\infty}^{\infty} p_0^x(y - c)\,p_c(c)\,dc$$

$$p_1^y(y) = \int_{-\infty}^{\infty} p_1^x(y - c)\,p_c(c)\,dc.$$

We have then that

$$\begin{aligned}
P_e &= \pi_1 + \int_{-\infty}^{\infty} \phi(y)\left[\pi_0 \int_{-\infty}^{\infty} p_0^x(y - c)\,p_c(c)\,dc - \pi_1 \int_{-\infty}^{\infty} p_1^x(y - c)\,p_c(c)\,dc\right] dy \\
&= \pi_1 + \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \phi(y)\left(\pi_0 p_0^x(y - c) - \pi_1 p_1^x(y - c)\right) dy\; p_c(c)\,dc \\
&= \pi_1 + E_c\left[\int_{-\infty}^{\infty} \phi(y)\left(\pi_0 p_0^x(y - c) - \pi_1 p_1^x(y - c)\right) dy\right]
\end{aligned}$$

where $E_c$ denotes expected value with respect to $p_c(c)$. Hence, to minimize $P_e$ we wish to choose $p_c(c)$ so that
the slightly more convenient form

$$J(p_c) = E_c\left[\int_{-\infty}^{\infty} \phi(y)\left(\pi_1 p_1^x(y - c) - \pi_0 p_0^x(y - c)\right) dy\right] \qquad (4.1)$$

is maximized, since $P_e = \pi_1 - J(p_c)$. This is done in the next section. We will see that the random variable C may be
chosen as a constant and therefore we need only maximize the expression within the brackets of
(4.1) over a constant c. But this is equivalent to shifting $\phi(\cdot)$, the decision region function, by -c.
Hence, the solution effectively shifts the decision region by a constant. This suggests that
another means for improving performance is to transform the decision region using a nonlinear
transformation (instead of a simple shift). It can be done by transforming the data sample x using
a nonlinear transformation g as g(x). This is addressed in Chapter 8 and will be considered
further in Phase II.

4.2  Derivation  of  Optimal  PDF  for  C 

It is well known that $E_c[g(C)]$ is maximized by placing all the probability mass at the value
c for which g(c) is maximized. We assume that the function g(c) has at least one point at which
a maximum is attained. Calling this point $c_0$, the optimal PDF is then $p_c(c) = \delta(c - c_0)$, where

$$c_0 = \arg\max_c\, g(c)$$

or

$$c_0 = \arg\max_c \int_{-\infty}^{\infty} \phi(y)\left(\pi_1 p_1^x(y - c) - \pi_0 p_0^x(y - c)\right) dy.$$

A slightly more convenient form for g(c) is obtained by letting u = y - c so that

$$g(c) = \int_{-\infty}^{\infty} \phi(u + c)\left(\pi_1 p_1^x(u) - \pi_0 p_0^x(u)\right) du \qquad (4.2)$$

which is recognized as a correlation between $\phi(u)$ and $\pi_1 p_1^x(u) - \pi_0 p_0^x(u)$. In summary, we
should add the constant c to x, where c is the value that maximizes the correlation given in (4.2).
Since the decision function $\phi(x)$ in (4.2) is completely general, the optimal solution is valid for a
given binary decision rule with any decision region. For example, if the original decision rule
were to decide $H_1$ if x > a, then we would use $\phi(u) = 1$ for u > a, and zero otherwise in (4.2). If it
were to decide $H_1$ if |x| < a, then we would use $\phi(u) = 1$ for |u| < a, and zero otherwise in (4.2).
(Note that if $\phi(u) = 1$ for $\pi_1 p_1^x(u) - \pi_0 p_0^x(u) > 0$ and zero otherwise, then g(c) is maximized for
c = 0. This is because in this case the decision rule $\phi(u)$ is optimal.) In the next section we solve
this for a given example.
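
Because (4.2) holds for any decision function, the optimal constant can be approximated numerically for a rule supplied as an arbitrary indicator function. The sketch below is an illustration only: it assumes the Gaussian mixture densities of the example (means ±3, unit variance, A = 1, equal priors), a midpoint-rule integral and a grid search over c; the integration limits, grid and function names are assumptions, not part of the derivation.

```python
import math

def normal_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def p0(x, mu=3.0, var=1.0):
    """H0 density: equal-weight Gaussian mixture with means +/- mu."""
    return 0.5 * normal_pdf(x, mu, var) + 0.5 * normal_pdf(x, -mu, var)

def p1(x, a=1.0):
    """H1 density: the H0 density shifted by the known level a."""
    return p0(x - a)

def g(c, phi, pi0=0.5, pi1=0.5, lo=-15.0, hi=15.0, n=2000):
    """Midpoint-rule approximation of (4.2)."""
    du = (hi - lo) / n
    total = 0.0
    for k in range(n):
        u = lo + (k + 0.5) * du
        total += phi(u + c) * (pi1 * p1(u) - pi0 * p0(u))
    return total * du

def best_constant(phi, c_lo=-6.0, c_hi=6.0, steps=120):
    """Grid search for the constant c that maximises g(c)."""
    grid = [c_lo + (c_hi - c_lo) * k / steps for k in range(steps + 1)]
    return max(grid, key=lambda c: g(c, phi))

def step_rule(x):
    """Fixed decision rule: decide H1 if x > 0."""
    return 1.0 if x > 0.0 else 0.0

c0 = best_constant(step_rule)
pe_before = 0.5 - g(0.0, step_rule)  # P_e = pi_1 - g(c), with pi_1 = 1/2
pe_after = 0.5 - g(c0, step_rule)
print(f"c0 = {c0:.2f}, Pe without shift = {pe_before:.4f}, with shift = {pe_after:.4f}")
```

A different rule, such as deciding $H_1$ if |x| < a, could be tried by passing another indicator function in place of `step_rule`. When g(c) has several equal maxima, the grid search simply returns one of them.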

4.3  The  Gaussian  Mixture  Example 

We now consider the problem described in [Kay 2000], but instead choose the probability of
error criterion. The problem is to decide between $p_0^x(x)$ and $p_1^x(x) = p_0^x(x - A)$, where A > 0
is a DC level that is known and the noise PDF is the Gaussian or normal mixture

$$p_0^x(x) = \tfrac{1}{2}\,N(x; \mu, \sigma^2) + \tfrac{1}{2}\,N(x; -\mu, \sigma^2) \qquad (4.3)$$

where

$$N(x; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}}\exp\left[-\frac{1}{2\sigma^2}(x - \mu)^2\right].$$

The original decision rule is to choose $H_1$ if x > 0 so that $\phi(x) = u_s(x)$, where $u_s(x)$ is the unit step
function. Additionally, we assume equal a priori probabilities so that $\pi_0 = \pi_1 = 1/2$. As a result,
we have from (4.2) that

$$\begin{aligned}
g(c) &= \frac{1}{2}\int_{-\infty}^{\infty} u_s(u + c)\left(p_1^x(u) - p_0^x(u)\right) du \\
&= \frac{1}{2}\int_{-c}^{\infty} \left(p_1^x(u) - p_0^x(u)\right) du \\
&= \frac{1}{2}\left[(1 - F_1(-c)) - (1 - F_0(-c))\right] \\
&= \frac{1}{2}\left[F_0(-c) - F_1(-c)\right]
\end{aligned}$$

where $F_i$ is the cumulative distribution function of x under the hypothesis $H_i$. For our problem,
we have that $p_1^x(x) = p_0^x(x - A)$ and so $F_1(x) = F_0(x - A)$. Thus,

$$g(c) = \frac{1}{2}\left[F_0(-c) - F_0(-c - A)\right]$$

and differentiating and setting equal to zero produces

$$p_0^x(-c) = p_0^x(-c - A)$$

or equivalently since $p_0^x(x)$ is even, we have the general requirement

$$p_0^x(c) = p_0^x(c + A). \qquad (4.4)$$

Using (4.3) in (4.4) produces

$$N(c; \mu, \sigma^2) + N(c; -\mu, \sigma^2) = N(c + A; \mu, \sigma^2) + N(c + A; -\mu, \sigma^2)$$

which upon simplification yields the equation

$$\exp(\mu c/\sigma^2) + \exp(-\mu c/\sigma^2) = \exp\left[-c(A - \mu)/\sigma^2 - A^2/(2\sigma^2) + \mu A/\sigma^2\right] + \exp\left[-c(A + \mu)/\sigma^2 - A^2/(2\sigma^2) - \mu A/\sigma^2\right].$$

For $\mu = 3$, $\sigma^2 = 1$, and A = 1, we have

$$\exp(3c) + \exp(-3c) = \exp(2c + 5/2) + \exp(-4c - 7/2).$$

The exact value of c found through a numerical search is c = 2.50, which could also be found by
ignoring the terms exp(-3c) and exp(-4c - 7/2) since these are nearly zero for this value of c.
Another solution is found by ignoring the other set of terms to yield c = -3.5. Note that either of
these choices causes the PDFs of x + c under $H_0$ and $H_1$ to cross at the origin (see Figs. 4.1 and
4.2). If we did not have the right-most Gaussian mode, then the choice of c = 2.5 would result in
a maximum likelihood (ML) receiver, which is optimum [Kay 1998]. This is because a
maximum likelihood receiver chooses the hypothesis whose PDF value is larger. In our case, the
fixed decision regions are $R_1 = \{x: x > 0\}$ for $H_1$ and $R_0 = \{x: x < 0\}$ for $H_0$ as shown in Fig. 4.1.
These decision regions are not optimal. The optimal ML decision regions are indicated in
Fig. 4.1 as $R_0^*$ and $R_1^*$. Therefore, the region in x for which $R_i^* \ne R_i$, which corresponds to the
dark PDF lines, will result in incorrect decisions. By the addition of c, however, the extent of
this incorrect decision region is reduced, as indicated in Fig. 4.2.

Fig. 4.1 Original PDFs. The left-most PDF modes cross at x = -2.5, which is indicated by the
dashed vertical line. The fixed decision regions are indicated by $R_i$, while the optimal ML
decision regions are indicated by $R_i^*$.

It is instructive to also plot the probability of error versus c or equivalently the probability of
correct decision $P_c = 1 - P_e$ versus c. This is shown in Fig. 4.3. Note that as expected the
probability of a correct decision is maximized at c = 2.5 and also at c = -3.5. This type of curve
is normally associated with the phenomenon of stochastic resonance, although here we see that it
is not unimodal. This result is unlike that reported in [1-3] and so debunks the common
assumption that adding too much noise will degrade performance. The latter is only true if the
performance is unimodal.

Fig. 4.2 PDFs after c = 2.5 is added to x. The fixed decision regions are indicated by $R_i$, while the
optimal ML decision regions are indicated by $R_i^*$.

Fig. 4.3 Probability of a correct decision versus the value of the constant c to be added to the
data sample. The dashed lines are at c = -3.5 and c = 2.5.
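
A curve of the kind shown in Fig. 4.3 can be tabulated from the closed form $P_c = 1/2 + g(c)$ with $g(c) = \frac{1}{2}[F_0(-c) - F_0(-c - A)]$. The following sketch assumes $\mu = 3$, $\sigma = 1$, A = 1 and equal priors; the grid range and spacing are arbitrary choices made for illustration.

```python
import math

def std_normal_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def cdf0(x, mu=3.0, sigma=1.0):
    """CDF of the H0 mixture 0.5*N(mu, sigma^2) + 0.5*N(-mu, sigma^2)."""
    return 0.5 * std_normal_cdf((x - mu) / sigma) + 0.5 * std_normal_cdf((x + mu) / sigma)

def prob_correct(c, a=1.0):
    """P_c = 1 - P_e = 1/2 + g(c), with g(c) = (1/2)[F0(-c) - F0(-c - a)]."""
    return 0.5 + 0.5 * (cdf0(-c) - cdf0(-c - a))

grid = [-8.0 + 0.01 * k for k in range(1601)]  # c from -8 to 8
pc = [prob_correct(c) for c in grid]

# Interior grid points that exceed both neighbours are local maxima.
peaks = [(grid[k], pc[k]) for k in range(1, len(grid) - 1)
         if pc[k] > pc[k - 1] and pc[k] >= pc[k + 1]]

for c, value in peaks:
    print(f"local maximum near c = {c:.2f}, Pc = {value:.4f}")
print(f"Pc with no added constant: {prob_correct(0.0):.4f}")
```

Under these assumptions the scan should report two local maxima of essentially equal height, one on each side of c = 0, which is the bimodal behaviour discussed above.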


5.0  Application  of  Stochastic  Resonance  to  Nonparametric  Detectors 

Here  we  consider  detection  performance  of  two  additional  nonparametric  detectors  which 
exhibit  improvement  via  additive  SR  noise;  namely,  the  Wilcoxon  and  the  dead-zone  limiter 
detectors.  In  addition  to  the  sign  detector,  these  detectors  were  considered  in  [Chen,  et.  al, 
2006c].  The  asymptotic  efficiency  (AE)  as  well  as  finite  sample  detection  performance  of  these 
SR  modified  detectors  was  reported.  For  large  sample  sizes,  the  AE  of  the  Wilcoxon  and  the 
dead-zone  limiter  [Kassam  1976]  detectors  was  shown  not  to  improve  by  the  addition  of  SR 
noise.  However,  for  finite  sample  sizes,  both  of  these  detectors  show  improvement  in  the 
presence  of  additive  SR  noise. 

Nonparametric detectors have received considerable attention in signal detection problems
[Kassam 1980]. An important feature of such detectors is their guaranteed level and reasonable
power for large classes of input distributions. However, in most cases, a nonparametric detector
is less efficient than the optimal detector. Therefore, an important consideration is the potential
improvement of their performance while maintaining their constant false alarm rate (CFAR) property.
Here, we explore the potential detection performance improvement of several nonparametric
detectors by adding SR noise to the observed data.

5.1  Problem  Formulation  for  Nonparametric  Detectors 

Let us consider a detection problem based on the observed data vector $\mathbf{x} = [x_1, x_2, \ldots, x_N]$
with probability density function $p(\mathbf{x})$, where the $x_i$, i = 1, 2, ..., N are independent identically
distributed (i.i.d.) scalar random variables. We decide between hypotheses $H_1$ and $H_0$ given by

$$H_0: p(\mathbf{x}) = \prod_{i=1}^{N} f_x(x_i)$$
$$H_1: p(\mathbf{x}) = \prod_{i=1}^{N} f_x(x_i - A) \qquad (5.1)$$

where the pdf $f_x(\cdot)$ of the scalar random variable $x_i$ is symmetric, i.e., $f_x(x) = f_x(-x)$, and A > 0.
Therefore, this test is essentially the detection of a constant positive DC signal A in additive
noise with a symmetric pdf.

Following the SR approach, detection performance enhancement is achieved by adding
noise $\mathbf{n} = [n_1, n_2, \ldots, n_N]$ to the original data process x to obtain a new vector y = x + n, where the $n_i$
are i.i.d. scalar random variables with pdf $f_n(n)$. The constant false alarm rate (CFAR) property
is maintained by retaining the symmetric pdf property of x. Therefore, we restrict $f_n(n)$ to be
symmetric, i.e., $f_n(n) = f_n(-n)$. Given $f_x(x)$ and $f_n(n)$, the pdf of y under the $H_0$ hypothesis can be
expressed by the convolution of the pdfs such that

$$\begin{aligned}
f_y(y_i) &= f_x(x_i) * f_n(n_i) \\
&= \int_{-\infty}^{\infty} f_x(x_i)\,f_n(y_i - x_i)\,dx_i \\
&= \int_{-\infty}^{\infty} f_x(y_i - x_i)\,f_n(x_i)\,dx_i. \qquad (5.2)
\end{aligned}$$

It can be shown that $f_y(y) = f_y(-y)$, i.e., $f_y(y)$ is a symmetric function. Therefore, $P(y > 0 \mid H_0) = 1/2$,
and the CFAR property of the nonparametric detectors is maintained.
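
The role of the symmetry restriction can be seen in a small numerical check. The sketch below assumes the Gaussian mixture background noise used elsewhere in this report (means ±3, unit variance) and discrete SR noise; the particular noise values, including the asymmetric counter-example, are hypothetical and chosen only for illustration.

```python
import math

def std_normal_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def cdf_x(x, mu=3.0, sigma=1.0):
    """CDF of the symmetric background noise (equal-weight Gaussian mixture)."""
    return 0.5 * std_normal_cdf((x - mu) / sigma) + 0.5 * std_normal_cdf((x + mu) / sigma)

def prob_positive(points, weights, signal=0.0):
    """P(x + signal + n > 0) when n takes the values `points` with probabilities `weights`."""
    return sum(w * (1.0 - cdf_x(-n - signal)) for n, w in zip(points, weights))

symmetric = ((-3.0, 3.0), (0.5, 0.5))    # satisfies f_n(n) = f_n(-n)
asymmetric = ((-1.0, 2.0), (0.5, 0.5))   # hypothetical counter-example

p0_sym = prob_positive(*symmetric)                # P(y > 0 | H0), symmetric noise
p0_asym = prob_positive(*asymmetric)              # P(y > 0 | H0), asymmetric noise
p1_sym = prob_positive(*symmetric, signal=1.0)    # P(y > 0 | H1), assumed A = 1

print(f"P(y > 0 | H0), symmetric noise : {p0_sym:.4f}")
print(f"P(y > 0 | H0), asymmetric noise: {p0_asym:.4f}")
print(f"P(y > 0 | H1), symmetric noise : {p1_sym:.4f}")
```

With symmetric noise the per-sample probability under $H_0$ should remain at 1/2, whereas the asymmetric choice shifts it away from 1/2 and so would not preserve the CFAR property.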

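The binary hypothesis testing problem for this new observed data y can be expressed as:

$$H_0: p(\mathbf{y}) = \prod_{i=1}^{N} f_y(y_i)$$
$$H_1: p(\mathbf{y}) = \prod_{i=1}^{N} f_y(y_i - A). \qquad (5.3)$$

The cumulative distribution function (cdf) of $y_i$ is given by

$$\begin{aligned}
F_y(y_i) &= \int_{-\infty}^{y_i} \int_{-\infty}^{\infty} f_x(x_i)\,f_n(t - x_i)\,dx_i\,dt \\
&= \int_{-\infty}^{\infty} f_x(x_i) \left[\int_{-\infty}^{y_i} f_n(t - x_i)\,dt\right] dx_i \\
&= \int_{-\infty}^{\infty} f_x(x_i)\,F_n(y_i - x_i)\,dx_i = \int_{-\infty}^{\infty} f_n(x_i)\,F_x(y_i - x_i)\,dx_i. \qquad (5.4)
\end{aligned}$$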


5.2  SR  Detection  Performance  for  Nonparametric  Detectors 

Detection performance for the three detectors is now considered using stochastic resonance
for the problem involving a known DC level in Gaussian mixture noise with mean $\mu = 3$ and
modal variance $\sigma^2 = 1$. In the asymptotic case where signal strength vanishes and sample sizes
approach infinity, the performance is evaluated in terms of the asymptotic relative efficiency
(ARE) between the original detector and the SR noise modified detector. Further, the ARE can
often be expressed as the ratio of their asymptotic efficiencies given by

$$E = \lim_{N \to \infty} \left\{ \frac{\left[\frac{d}{dA} E\left(T(\mathbf{x}_N)\right)\Big|_{A=0}\right]^2}{N\,\mathrm{Var}_{A=0}\left[T(\mathbf{x}_N)\right]} \right\} \qquad (5.5)$$

where $T(\cdot)$ is the test statistic. Similarly, for the finite sample case, we compare the relative
performances by the deflection measure [Picinbono 1995] which is defined as

$$D(T) = \frac{\left[E(T \mid H_1) - E(T \mid H_0)\right]^2}{\mathrm{var}(T \mid H_0)} \qquad (5.6)$$

for the sign and Wilcoxon detectors. For the dead-zone limiter detector, we illustrate the
performance by several examples.

A. The Sign Detector


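For the sign detector, we have a test statistic and decision rule as follows

$$T_s(\mathbf{x}) = \sum_{i=1}^{N} \mathrm{sgn}(x_i) \underset{H_0}{\overset{H_1}{\gtrless}} \gamma \qquad (5.7)$$

where $\gamma$ is the decision threshold and sgn(x) is the sign of x, given by

$$\mathrm{sgn}(x) = \begin{cases} 1 & x > 0 \\ 0 & x < 0. \end{cases} \qquad (5.8)$$

Let $P_i^x = P(\mathrm{sgn}(x) = 1 \mid H_i)$, i = 0, 1. From (5.1), we have $P_0^x = \int_0^{\infty} f_x(x)\,dx = 0.5$ and

$$\begin{aligned}
P_1^x &= \int_0^{\infty} f_x(x - A)\,dx \\
&= \int_{-A}^{\infty} f_x(x)\,dx = F_x(A) \\
&= \frac{1}{2}Q\left(\frac{\mu - A}{\sigma}\right) + \frac{1}{2}Q\left(\frac{-\mu - A}{\sigma}\right) > 0.5. \qquad (5.9)
\end{aligned}$$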


Furthermore, the test statistic $T_S$ is binomially distributed with parameter $P_i^x$ under $H_i$, $i = 0, 1$. Since $P_0^x = 0.5$ is fixed, when $P_1^x > 0.5$ the detection performance of the sign detector is monotonically determined by $P_1^x$; i.e., the higher the value of $P_1^x$, the better the detection performance of (5.7). It can also be shown that the expected value of $T_S$ under $H_i$ is $E(T_S|H_i) = N P_i^x$, $i = 0, 1$, and the variance of $T_S$ under $H_0$ is $\mathrm{var}(T_S|H_0) = N/4$. The deflection measure of the sign detector, $D_S^x$, is given as

$$D_S^x = 4N(P_1^x - P_0^x)^2 = 4N(P_1^x - 0.5)^2. \tag{5.10}$$

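Similarly, for $y$, we have $P_0^y = 0.5 = P_0^x$ and $D_S^y = 4N(P_1^y - 0.5)^2$. However, due to the additive SR noise $n$, $P_1^y$ is changed such that

$$\begin{aligned}
P_1^y &= \int_{-A}^{\infty} f_y(y)\,dy \\
&= \int_{-A}^{\infty} \int_{-\infty}^{\infty} f_x(y - x) f_n(x)\,dx\,dy \\
&= \int_{-\infty}^{\infty} f_n(x) \int_{-A}^{\infty} f_x(y - x)\,dy\,dx \\
&= \int_{-\infty}^{\infty} f_n(x) F_x(A + x)\,dx \\
&= \int_{-\infty}^{\infty} \frac{1}{2} f_n(x) \left[ F_x(A + x) + F_x(A - x) \right] dx \\
&= \int_{-\infty}^{\infty} f_n(x) G(x)\,dx.
\end{aligned} \tag{5.11}$$

The right-hand side of (5.11) is obtained by applying the symmetry property of $f_n(x)$ and letting $G(x) = \frac{1}{2}[F_x(A + x) + F_x(A - x)]$. From (5.9), $P_1^x = G(0)$. Let $G_M = \max(G(x))$ and $x_0$ be the minimum non-negative $x$ such that $G(x_0) = G_M$. Since $\int_{-\infty}^{\infty} f_n(x)\,dx = 1$, we have $P_1^y \le G_M$. The equality can be obtained by selecting an optimum SR noise pdf $f_n^o$ such that

$$f_n^o(x) = \frac{1}{2}\delta(x - x_0) + \frac{1}{2}\delta(x + x_0). \tag{5.12}$$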


Therefore, when $G_M > G(0)$, the detection performance of the sign detector can be improved by adding SR noise. Correspondingly, its deflection measure is such that $D_S^{y,o} > D_S^x$. Note that $G(x)$ can be expressed as

$$\begin{aligned}
G(x) &= \frac{F_x(A + x) + F_x(A - x)}{2} \\
&= \frac{F_x(A + x) + 1 - F_x(-A + x)}{2} \\
&= \frac{1}{2} + \frac{1}{2} \int_{x - A}^{x + A} f_x(t)\,dt.
\end{aligned} \tag{5.13}$$


Therefore, for the asymptotic case where $N \to \infty$ and $A \to 0$, it follows that $G(x) \approx \frac{1}{2} + A f_x(x)$, so that $G(0) \approx \frac{1}{2} + A f_x(0)$. As a result, the asymptotic detection performance can be improved if $f_x(0) \ne \max(f_x(x))$. The same conclusion can also be obtained by evaluating the asymptotic efficiency and the ARE between the original detector and the SR noise enhanced detector. For the sign detector, its asymptotic efficiency is given by

$$E_S^x = 4 f_x^2(0) \tag{5.14}$$

and similarly

$$E_S^y = 4 f_y^2(0) = 4 \left[ \int_{-\infty}^{\infty} f_x(x) f_n(x)\,dx \right]^2. \tag{5.15}$$


Again, since $\int_{-\infty}^{\infty} f_n(x)\,dx = 1$, the optimum SR pdf for this case is

$$f_n^o(x) = \frac{1}{2}\delta(x - x_0) + \frac{1}{2}\delta(x + x_0) \tag{5.16}$$

where $f_x(x_0) = \max(f_x(x))$. In general, the ARE between the noise modified detector and the original detector, $E_S^{y,x}$, is given by

$$E_S^{y,x} = \frac{f_y^2(0)}{f_x^2(0)}. \tag{5.17}$$

B. The Wilcoxon Detector

The Wilcoxon detector test statistic is expressed as

$$T_W = \sum_{i=1}^{N} \sum_{j=1}^{i} \mathrm{sgn}(x_i + x_j) \underset{H_0}{\overset{H_1}{\gtrless}} \eta_W \tag{5.18}$$

where $\eta_W$ is the decision threshold.


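In the asymptotic case, the asymptotic efficiency of the original Wilcoxon detector, $E_W^x$, is given by $E_W^x = 12 \left[ \int_{-\infty}^{\infty} f_x^2(x)\,dx \right]^2$. Let $H_x(\omega) = \int_{-\infty}^{\infty} f_x(x) \exp(-j\omega x)\,dx$ be the Fourier transform of $f_x(\cdot)$. Since $f_x(x) \ge 0$ is a symmetric real function, $H_x(\omega) = \int_{-\infty}^{\infty} f_x(x) \cos(\omega x)\,dx$ is also a real function. From Parseval's theorem, we have

$$E_W^x = 12 \left[ \int_{-\infty}^{\infty} f_x^2(x)\,dx \right]^2 = \frac{3}{\pi^2} \left[ \int_{-\infty}^{\infty} H_x^2(\omega)\,d\omega \right]^2. \tag{5.19}$$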


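Similarly, the asymptotic efficiency of the SR noise modified detector, $E_W^y$, can be expressed as

$$E_W^y = 12 \left[ \int_{-\infty}^{\infty} f_y^2(y)\,dy \right]^2 = \frac{3}{\pi^2} \left[ \int_{-\infty}^{\infty} H_y^2(\omega)\,d\omega \right]^2. \tag{5.20a}$$

From (5.2), we have $H_y(\omega) = H_x(\omega) H_n(\omega)$, so that

$$E_W^y = \frac{3}{\pi^2} \left[ \int_{-\infty}^{\infty} H_x^2(\omega) H_n^2(\omega)\,d\omega \right]^2 \tag{5.20b}$$

and the ARE between the noise modified detector and the original detector is given by

$$E_W^{y,x} = \frac{\left[ \int_{-\infty}^{\infty} H_x^2(\omega) H_n^2(\omega)\,d\omega \right]^2}{\left[ \int_{-\infty}^{\infty} H_x^2(\omega)\,d\omega \right]^2}. \tag{5.21}$$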


Note that

$$|H_n(\omega)| = \left| \int_{-\infty}^{\infty} f_n(x) \exp(-j\omega x)\,dx \right| = \left| \int_{-\infty}^{\infty} f_n(x) \cos(\omega x)\,dx \right| \le \int_{-\infty}^{\infty} f_n(x) |\cos(\omega x)|\,dx \le \int_{-\infty}^{\infty} f_n(x)\,dx = 1.$$

Therefore, we have $H_y^2(\omega) \le H_x^2(\omega)$ and, furthermore, $E_W^y \le E_W^x$. In other words, in the asymptotic limit where $N \to \infty$ and $A \to 0$, the Wilcoxon detector performance cannot be improved by adding independent SR noise. However, the detection performance may still be improved in the finite sample case.

C. The Dead-Zone Limiter Detector


The dead-zone limiter detector [Kassam 1976] employs the dead-zone limiter characteristic $l_c$ to operate on the data, where $l_c(\cdot)$ is given by

$$l_c(x) = \begin{cases} 1, & x > c \\ 0, & -c < x < c \\ -1, & x < -c \end{cases} \tag{5.22}$$


where $c$ is a prescribed positive number. Let $N_{cp}$ be the number of samples which satisfy $x_i > c$ and $N_c$ be the number of samples which satisfy $|x_i| > c$. In order to obtain a false alarm rate $\alpha$, the dead-zone limiter detector selects the $H_1$ hypothesis with probability one when $N_{cp} > g_\alpha(N_c)$ and with probability $\beta_\alpha(N_c)$ when $N_{cp} = g_\alpha(N_c)$. Both $g_\alpha(N_c)$ and $\beta_\alpha(N_c)$ are suitable functions such that the false alarm rate is fixed at $\alpha$. For the dead-zone limiter detector with parameter $c$, assuming $F_x(c) < 1$ and $f_x(c)$ is continuous at $c$, we have its asymptotic efficiency $E_{DZ}^x$ given by

$$E_{DZ}^x = \frac{2 f_x^2(c)}{1 - F_x(c)}. \tag{5.23a}$$

Thus, for the SR noise modified detector, its asymptotic efficiency is expressed as

$$E_{DZ}^y = \frac{2 f_y^2(c)}{1 - F_y(c)}. \tag{5.23b}$$


The ARE between the SR noise modified dead-zone limiter detector and the original dead-zone limiter detector is given by

$$E_{DZ}^{y,x} = \frac{f_y^2(c) / (1 - F_y(c))}{f_x^2(c) / (1 - F_x(c))}. \tag{5.24}$$

Using (5.2) in (5.23b), we have

$$\begin{aligned}
E_{DZ}^y &= \frac{2 \left( \int f_x(c - x) f_n(x)\,dx \right)^2}{1 - \int F_x(c - x) f_n(x)\,dx} \\
&= \frac{2 \left( \int f_x(c - x) f_n(x)\,dx \right)^2}{\int (1 - F_x(c - x)) f_n(x)\,dx} \\
&\le \frac{2 \int f_x^2(c - x) f_n(x)\,dx}{\int (1 - F_x(c - x)) f_n(x)\,dx}.
\end{aligned} \tag{5.25}$$

Let $k = \max_x \frac{f_x^2(c - x)}{1 - F_x(c - x)}$. Therefore, when $c$ is selected to maximize $E_{DZ}^x$, i.e., $k = \frac{f_x^2(c)}{1 - F_x(c)}$, we have $E_{DZ}^y \le 2k = E_{DZ}^x$. Thus, the asymptotic efficiency of the tuned dead-zone limiter detector with optimal parameter $c$ cannot be improved by adding SR noise. However, (5.25) does not rule out the possible SR effect when $c$ is not optimum. Furthermore, similar to the Wilcoxon detector, for the finite sample and vanishing signal case, the detection performance of the dead-zone limiter detector may still be improved by adding suitable SR noise.


5.3  Experimental  Results 


Here we consider the detection of a known DC signal in symmetric Gaussian mixture noise; i.e.,

$$f_x(x) = \frac{1}{2}\gamma(x; -\mu, \sigma_0^2) + \frac{1}{2}\gamma(x; \mu, \sigma_0^2)$$

where

$$\gamma(x; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(x - \mu)^2}{2\sigma^2} \right)$$

is the PDF of a Gaussian random variable with mean $\mu$ and variance $\sigma^2$. We consider two types of SR noise. These include the symmetric two-peak random noise with two random values,

$$f_s(x) = 0.5\,\delta(x - \tau) + 0.5\,\delta(x + \tau) \qquad \text{(symmetric two-peak SR noise)}$$

and white Gaussian SR noise,

$$f_g(x) = \gamma(x; 0, \tau^2). \qquad \text{(Gaussian SR noise)}$$

The noise modified data processes are denoted as $y_s$ and $y_g$, respectively. In this example, we set $\sigma_0 = 1$, $\mu = 3$, and the sample size $N = 5$. From (5.2), we have the PDF of $y_g$ given by

$$f_{y_g}(y) = \frac{1}{2}\gamma(y; -\mu, \sigma_0^2 + \tau^2) + \frac{1}{2}\gamma(y; \mu, \sigma_0^2 + \tau^2) \tag{5.26}$$

and the PDF of $y_s$ given by

$$f_{y_s}(y) = \frac{1}{4}\gamma(y; -\mu - \tau, \sigma_0^2) + \frac{1}{4}\gamma(y; -\mu + \tau, \sigma_0^2) + \frac{1}{4}\gamma(y; \mu - \tau, \sigma_0^2) + \frac{1}{4}\gamma(y; \mu + \tau, \sigma_0^2). \tag{5.27}$$

Next,  we  evaluate  both  the  asymptotic  detection  performance  and  the  finite  sample  detection 
performance  for  this  particular  detection  problem  for  the  three  nonparametric  detectors. 

A.  Asymptotic  Detection  Performance 

In this section, the asymptotic efficiencies of the three SR modified nonparametric detectors for both SR noises are obtained and plotted in Fig. 5.1. The detection performances of the nonparametric detectors based on $y_s$ and $y_g$ are denoted as $E^{y_s}$ and $E^{y_g}$, respectively. For the dead-zone limiter detector, two different $c$ values are examined. One is the optimum value of the dead-zone limiter, $c_0 = 3.61$, for the problem considered here, and its corresponding values are shown as $E_{DZ,0}^{y_s}$ and $E_{DZ,0}^{y_g}$ in the figure. The other parameter is $c_1 = 0.61\sqrt{\mu^2 + \sigma_0^2} = 1.929$, which is the optimum value assuming that the noise is Gaussian distributed with the same variance as that in our example. In both Fig. 5.1a and Fig. 5.1b, the asymptotic efficiency of the Wilcoxon detector and the dead-zone limiter with $c_0 = 3.61$ is maximum when $\tau = 0$; i.e., in the limit of large data samples, the detection performance of the optimal dead-zone limiter and the Wilcoxon detector cannot be improved by adding SR noise. However, for the sign detector and the suboptimal dead-zone limiter ($c_1 = 1.929$), the detection performance can be enhanced. For the sign detector based on $y_s$, from (5.16) we have $f_x(\tau_s) = \max(f_x(x))$. Since $f_x$ is a symmetric Gaussian mixture and $2\mu = 6\sigma_0$, the distance between the two peaks is significantly larger than their standard deviations and the maximum value of $f_x$ is reached at the mean value of each component of the mixture; i.e., $\tau_s = \mu = 3$. Thus, we have the maximum achievable asymptotic efficiency $E_S^{y_s} = 0.1592$. Compared to the original sign detector, which has a low asymptotic efficiency of $E_S^x = 7.8565 \times 10^{-5}$, the ARE between these two detectors is $E_S^{y_s,x} = 2026$; i.e., by adding suitable SR noise, the detection performance of the sign detector is enhanced by more than a factor of 2000. Similarly, for the Gaussian SR noise case, we have the optimum $\tau_g = \sqrt{\mu^2 - \sigma_0^2} = 2.8284$. Furthermore, the maximum achievable asymptotic efficiency when Gaussian SR noise is added is $E_S^{y_g} = 0.026$ and the corresponding ARE is $E_S^{y_g,x} = 331$.
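
The sign detector efficiencies quoted above follow directly from (5.14), (5.15), and (5.26), so they can be checked numerically. The following is a minimal sketch under the example's settings (mean 3, modal variance 1); the function and variable names are illustrative assumptions, not part of the original report.

```python
import math


def gauss_pdf(x, mean, var):
    """Gaussian PDF gamma(x; mean, var)."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def mixture_pdf(x, mu, var):
    """Symmetric mixture 0.5*gamma(x; -mu, var) + 0.5*gamma(x; mu, var)."""
    return 0.5 * gauss_pdf(x, -mu, var) + 0.5 * gauss_pdf(x, mu, var)


MU, VAR0 = 3.0, 1.0  # mixture mean and modal variance used in the example

# Original sign detector: efficiency 4 * f_x(0)^2, eq. (5.14).
eff_x = 4.0 * mixture_pdf(0.0, MU, VAR0) ** 2

# Two-peak SR noise at +/- tau: by (5.15), f_y(0) = f_x(tau); the best tau is mu.
tau_s = MU
eff_ys = 4.0 * mixture_pdf(tau_s, MU, VAR0) ** 2

# Gaussian SR noise: f_y is the same mixture with variance VAR0 + tau^2, eq. (5.26).
tau_g = math.sqrt(MU ** 2 - VAR0)
eff_yg = 4.0 * mixture_pdf(0.0, MU, VAR0 + tau_g ** 2) ** 2

print(f"original efficiency        : {eff_x:.4e}")
print(f"two-peak efficiency, ARE   : {eff_ys:.4f}, {eff_ys / eff_x:.0f}")
print(f"Gaussian tau, efficiency, ARE: {tau_g:.4f}, {eff_yg:.4f}, {eff_yg / eff_x:.0f}")
```

Under these assumptions the sketch should agree with the efficiencies and AREs stated in the text to the quoted precision.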

For the Wilcoxon detector, the results shown in Fig. 5.1 are to be expected. When $\tau > 0$, the efficiencies $E_W^{y_s}$ and $E_W^{y_g}$ are always less than $E_W^x$; i.e., adding any noise to the observation data will only degrade the detection performance. The same conclusion can be drawn for the dead-zone limiter detector with the optimal parameter $c_0 = 3.61$. However, for the dead-zone limiter detector with suboptimal $c_1 = 1.929$, as shown in Fig. 5.1a and Fig. 5.1b, when $\tau$ is relatively small, $E_{DZ,1}$ increases when $\tau$ increases; i.e., the detection performance is improved by adding suitable SR noise. For $y_s$, the maximum is $E_{DZ,1}^{y_s} = 0.0657$ with parameter $\tau = 0.92$, and for $y_g$ the maximum $E_{DZ,1}^{y_g} = 0.0614$ is achieved with parameter $\tau = 0.590$. The AREs for both cases are $E_{DZ,1}^{y_s,x} = 1.116$ and $E_{DZ,1}^{y_g,x} = 1.0416$ for $y_s$ and $y_g$, respectively.
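
The dead-zone limiter efficiency in (5.23b) only needs the PDF and CDF of the noise modified data at the point c, both of which are available in closed form for Gaussian mixtures. The sketch below evaluates it on a grid of SR noise parameters for the suboptimal limiter; the helper names, the representation of a mixture as (weight, mean, variance) triples, and the grid resolution are assumptions made for illustration.

```python
import math


def gauss_pdf(x, mean, var):
    """Gaussian PDF with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def gauss_cdf(x, mean, var):
    """Gaussian CDF, written with the complementary error function."""
    return 0.5 * math.erfc((mean - x) / math.sqrt(2.0 * var))


def dz_efficiency(components, c):
    """2 f(c)^2 / (1 - F(c)) for a mixture given as (weight, mean, var) triples."""
    f_c = sum(w * gauss_pdf(c, m, v) for w, m, v in components)
    cdf_c = sum(w * gauss_cdf(c, m, v) for w, m, v in components)
    return 2.0 * f_c ** 2 / (1.0 - cdf_c)


def two_peak_components(tau, mu=3.0, var0=1.0):
    """Mixture of y_s, eq. (5.27): four equally weighted Gaussians."""
    return [(0.25, s * mu + t * tau, var0) for s in (-1.0, 1.0) for t in (-1.0, 1.0)]


def gaussian_components(tau, mu=3.0, var0=1.0):
    """Mixture of y_g, eq. (5.26): variance inflated by tau^2."""
    return [(0.5, -mu, var0 + tau ** 2), (0.5, mu, var0 + tau ** 2)]


C1 = 0.61 * math.sqrt(3.0 ** 2 + 1.0)  # suboptimal dead-zone parameter
baseline = dz_efficiency(two_peak_components(0.0), C1)  # tau = 0: no SR noise

taus = [i * 0.01 for i in range(301)]  # assumed search grid, 0 to 3
best_s = max(taus, key=lambda t: dz_efficiency(two_peak_components(t), C1))
best_g = max(taus, key=lambda t: dz_efficiency(gaussian_components(t), C1))

for name, tau, comps in (("two-peak", best_s, two_peak_components(best_s)),
                         ("Gaussian", best_g, gaussian_components(best_g))):
    eff = dz_efficiency(comps, C1)
    print(f"{name}: best tau = {tau:.2f}, efficiency = {eff:.4f}, ARE = {eff / baseline:.4f}")
```

A grid search is used here only for simplicity; any one-dimensional optimizer would serve equally well.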


Fig. 5.1 Asymptotic Efficiency of the SR Noise Modified Nonparametric Detectors, (a) based on $y_s$, (b) based on $y_g$.

B. Finite Sample Size Detection Performance

In this section, we consider two different values of $A$ to examine the detection performance of the nonparametric detectors. For the sign and Wilcoxon detectors, $A = 1$ is assumed in the first example. From (5.9), we have

$$P_1^x = F_x(A) = \frac{1}{2} Q\!\left( \frac{\mu - A}{\sigma_0} \right) + \frac{1}{2} Q\!\left( \frac{-\mu - A}{\sigma_0} \right)$$

where $Q(x) = \int_x^{\infty} \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{t^2}{2} \right) dt$ is the complementary CDF of the standard Gaussian distribution. Therefore, we have in this case,

$$\begin{aligned}
G(x) &= \frac{F_x(A + x) + F_x(A - x)}{2} \\
&= \frac{1}{4} Q\!\left( \frac{\mu - A - x}{\sigma_0} \right) + \frac{1}{4} Q\!\left( \frac{\mu - A + x}{\sigma_0} \right) + \frac{1}{4} Q\!\left( \frac{-\mu - A - x}{\sigma_0} \right) + \frac{1}{4} Q\!\left( \frac{-\mu - A + x}{\sigma_0} \right).
\end{aligned} \tag{5.28}$$


Taking the derivative w.r.t. $x$, setting it equal to zero, and solving, we have $x_0 \approx \mu = 3$. The optimal SR noise PDF for the sign detector is $f_n^o(x) = \frac{1}{2}\delta(x + 3) + \frac{1}{2}\delta(x - 3)$. Thus, we have $P_1^{y,o} = 0.6707$ and the maximum deflection measure $D_S^{y,o} = 0.5826$. The relationships between the deflection measure $D$ and $\tau$ for the sign and Wilcoxon detectors obtained by Monte Carlo simulation are shown in Fig. 5.2. Clearly, both sign detector curves achieve their peak values when $\tau > 0$; i.e., an improvement in detection performance is obtained when suitable noise is added. The deflection measure of the SR noise modified sign detectors for both types of SR noise improves when the SR noise is suitable. For the Wilcoxon detector, on the other hand, the behavior depends on the noise type. As shown in Fig. 5.2a, when the SR noise is a randomization of two values, the deflection measure $D_W^{y_s}$ may be larger than that of the original detector, as noted by the peak at $\tau = 3$. However, this is not the case when the additive SR noise is Gaussian. As shown in Fig. 5.2b, $D_W^{y_g}$ decreases monotonically as $\tau$ increases; i.e., for the Wilcoxon detector, the SR phenomenon is not observed for Gaussian SR noise.

[Plot axes: deflection measure of $y$ versus standard deviation $\tau$.]

Fig. 5.2 Detection performance for the sign detector and the Wilcoxon detector using a finite sample size $N = 5$ and signal strength $A = 1$; a.) based on $y_s$, b.) based on $y_g$.
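
The finite sample figures quoted for the sign detector with A = 1 can be reproduced from (5.10) and (5.28) without simulation. The following is a minimal sketch; the function names and the grid search over the noise location are illustrative assumptions rather than the procedure used in the report, which solves for the maximizer analytically.

```python
import math


def q_func(z):
    """Complementary CDF of the standard Gaussian distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def mixture_cdf(x, mu=3.0, sigma0=1.0):
    """CDF of the symmetric Gaussian mixture with component means -mu and +mu."""
    return 0.5 * q_func((mu - x) / sigma0) + 0.5 * q_func((-mu - x) / sigma0)


def g_func(x, a=1.0, mu=3.0, sigma0=1.0):
    """G(x) = [F_x(A + x) + F_x(A - x)] / 2, as in (5.28)."""
    return 0.5 * (mixture_cdf(a + x, mu, sigma0) + mixture_cdf(a - x, mu, sigma0))


def sign_deflection(p1, n):
    """Deflection 4 N (P_1 - 0.5)^2 of the sign detector, as in (5.10)."""
    return 4.0 * n * (p1 - 0.5) ** 2


N, A = 5, 1.0
grid = [i * 0.001 for i in range(6001)]       # assumed search range 0 to 6
x0 = max(grid, key=lambda x: g_func(x, A))    # first (smallest) maximizer on the grid

p1_x = g_func(0.0, A)  # without SR noise, P_1^x = G(0)
p1_y = g_func(x0, A)   # with two-point SR noise at +/- x0

print(f"x0 = {x0:.3f}")
print(f"P_1 without SR noise = {p1_x:.4f}, deflection = {sign_deflection(p1_x, N):.4f}")
print(f"P_1 with SR noise    = {p1_y:.4f}, deflection = {sign_deflection(p1_y, N):.4f}")
```

The comparison of the two deflection values illustrates how much the two-point SR noise helps for this small sample size.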


For the dead-zone limiter detector, we chose a relatively strong signal with $A = 4$. Again, two dead-zone limiter detectors with parameters $c_0 = 3.61$ and $c_1 = 1.929$ are tested. Since the dead-zone detector is a conditional detector, unlike the other two detectors, it is difficult to calculate its deflection measure. To evaluate its relative detection performance, we use the detection power $P_D$ of the SR noise enhanced dead-zone limiter while keeping the false alarm rate fixed at $\alpha = 0.1$. Intensive Monte Carlo simulations were performed to obtain the detection performance. As shown in Fig. 5.3a, for both dead-zone limiter detectors based on observation $y_s$, the detection performance increases as $\tau > 0$ increases and reaches its peak when $\tau \approx 2.8$ and $\tau \approx 2.1$ for $c_0$ and $c_1$, respectively. When the additive SR noise is Gaussian, maximum detection powers are obtained when $\tau \approx 1.8$ and $\tau \approx 1.4$, respectively. Also, by comparing the maximum values from these two figures, we find that the detection performance of $y_s$ is better than that of $y_g$; i.e., in this case, adding two-peak random SR noise is better than adding Gaussian SR noise.


Fig. 5.3a. Probability of detection versus standard deviation $\tau$ based on $y_s$ for the dead-zone limiter detector with sample size $N = 5$, signal strength $A = 4$, and false alarm rate $\alpha = 0.1$.

Fig. 5.3b. Probability of detection versus standard deviation $\tau$ based on $y_g$ for the dead-zone limiter detector with sample size $N = 5$, signal strength $A = 4$, and false alarm rate $\alpha = 0.1$.

5.4  Summary  of  Results  for  SR  Enhanced  Nonparametric  Detectors 

In  this  chapter,  we  have  investigated  the  performance  of  the  SR  noise  enhanced  sign, 
Wilcoxon,  and  dead-zone  limiter  detectors.  The  asymptotic  efficiency  as  well  as  the  finite 
sample  detection  performance  of  the  SR  modified  detectors  are  obtained.  It  has  been  shown  that 
for the sign detector, an improvement of the asymptotic efficiency is possible under certain
conditions. For the Wilcoxon detector, there is no SR effect in terms of the asymptotic efficiency.
For  the  dead-zone  limiter  detector,  a  similar  conclusion  is  obtained  when  its  c  parameter  is 
optimal.  However,  when  c  is  not  the  optimal  value,  it  is  still  possible  to  improve  its  asymptotic 
efficiency  by  adding  suitable  SR  noise.  Also,  as  shown  in  our  detection  example,  it  is  possible  to 
improve  the  performance  of  these  detectors  when  only  limited  data  samples  are  available. 
Specifically, a remarkable result shown for this problem is the fact that whereas the Wilcoxon detector is far superior to the sign detector without SR noise ($\tau = 0$), the presence of the optimal SR noise enables the sign detector to significantly outperform the Wilcoxon detector.

Overall, SR provides an important approach to enhance the performance of commonly used nonparametric detectors. Similar approaches can be employed for other nonparametric detectors. Further issues such as the determination of the optimum SR noise PDF for a larger set of nonparametric detectors are under investigation.

6.0 Application of Stochastic Resonance to Imagery
6.1  Image  Quality  Metrics 

In Phase I, a state-of-the-art assessment was conducted on image quality metrics to determine if they can provide an approximate and more efficient method of obtaining the quality scores provided by human observers. Important considerations involve image noise consisting of (a) random image noise such as impulsive and additive noise as well as (b) structural noise consisting of artifacts in the imagery. Another important consideration is the degradation of image sharpness due to blurring, ringing, and blocking effects. The advantage of image quality metrics is that they can be used to dynamically monitor and adjust image quality, optimize parameters and algorithms, and benchmark image processing systems and algorithms.

Objective quality metrics can be placed in three general categories. These consist of (a) full reference, (b) no reference, and (c) reduced reference. Full reference metrics assume a so-called ‘perfect version’ in the sense that ‘ground truth’ or a ‘golden image’ is available as a reference. Here, the image quality is measured by comparing the difference between the reference and the distorted image. For the ‘no reference’ case, no such reference is available so that the image itself must be assessed. Finally, the ‘reduced reference’ methods utilize available partial information regarding the ‘perfect image’, as in the case of image fusion processing where multiple images are utilized.

Several image comparison metrics have been proposed to compare the similarity between different images. Among the most prevalent is the mean-squared error metric, which is the average of the squared pixel value differences between the two images and is expressed as

$$\mathrm{MSE} = \frac{1}{NM} \sum_{i=1}^{N} \sum_{j=1}^{M} \left[ X(i,j) - Y(i,j) \right]^2 \tag{6.1}$$

where $X(i,j)$ and $Y(i,j)$ are the $(i,j)$th pixel values for images $X$ and $Y$, respectively.
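
As a concrete illustration of the mean-squared error metric, the sketch below computes it for two small images stored as nested lists; the function name and the list-of-rows image representation are assumptions chosen to keep the example dependency-free.

```python
def mse(image_x, image_y):
    """Mean-squared error between two equally sized images given as lists of rows."""
    if len(image_x) != len(image_y) or any(
        len(row_x) != len(row_y) for row_x, row_y in zip(image_x, image_y)
    ):
        raise ValueError("images must have the same dimensions")
    n_pixels = sum(len(row) for row in image_x)
    if n_pixels == 0:
        raise ValueError("images must not be empty")
    # Sum of squared pixel differences, then average over all N*M pixels.
    total = sum(
        (px - py) ** 2
        for row_x, row_y in zip(image_x, image_y)
        for px, py in zip(row_x, row_y)
    )
    return total / n_pixels


reference = [[1, 2], [3, 4]]
distorted = [[1, 2], [3, 6]]
print(mse(reference, distorted))  # one pixel differs by 2, so 4 / 4 = 1.0
```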

Due to its important statistical meaning, Mutual Information (MI) is also widely used as an evaluation measure, which is given by

$$I = H(X) + H(Y) - H(X, Y) \tag{6.2}$$

where $H(\cdot)$ is the entropy. Although it can be shown that for $I = H(X)$, $Y$ is identical to $X$, MI is not directly related to human perception performance; i.e., MI does not adequately indicate the visual perception performance [Chen et al. 2005]. Also, since the calculation of MI involves the estimation of a joint PDF between $X$ and $Y$, MI is often not suitable for a relatively small image where the sample support is low. In addition, it is difficult to extend MI to handle a high dimensional image dataset. A related quantity, however, called the visual information fidelity (VIF) metric, is considered below.
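
The dependence of MI on an estimated joint PDF can be made concrete with a simple histogram (plug-in) estimator. The following is a minimal sketch assuming the two images are supplied as flat sequences of discrete pixel values of equal length; the function names are illustrative and the estimator is only one of several possible choices.

```python
import math
from collections import Counter


def entropy(counts, total):
    """Plug-in entropy in bits from a collection of occurrence counts."""
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def mutual_information(pixels_x, pixels_y):
    """I = H(X) + H(Y) - H(X, Y), estimated from paired pixel values."""
    if len(pixels_x) != len(pixels_y) or not pixels_x:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(pixels_x)
    h_x = entropy(Counter(pixels_x).values(), n)
    h_y = entropy(Counter(pixels_y).values(), n)
    # The joint histogram is what becomes unreliable when the image is small.
    h_xy = entropy(Counter(zip(pixels_x, pixels_y)).values(), n)
    return h_x + h_y - h_xy


x = [0, 0, 1, 1]
print(mutual_information(x, x))             # identical images: I equals H(X)
print(mutual_information(x, [0, 1, 0, 1]))  # independent patterns: I is zero
```

With only a handful of pixels per joint histogram bin, this estimate becomes strongly biased, which is the small-sample limitation noted above.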

Structural Similarity (SSIM) [Z. Wang, et al. 2004], a nonlinear combination of the differences in terms of the mean, variance, and contrast, is also widely used and is expressed as

(6.3)


where DII is the distorted image information and RII is the reference image information. A signal gain and additive noise model in the wavelet domain given by

$$D = GC + V \tag{6.6}$$

is applied to describe the test image, where $C$ denotes the random field (RF) from a wavelet sub-band in the reference image and $D$ denotes the corresponding signal in the test image. $G$ is a deterministic scalar gain and $V$ is white Gaussian noise. A Gaussian distribution assumption is then made to calculate the mutual information. Overall, DII and RII are obtained as the summation of the mutual information of all sub-bands between the distorted image and the true image $C$ and between the reference image and $C$, respectively. Examples of the VIF metric are shown in Fig. 6.1 for VIF values of 1.0 (reference image), 1.10, 0.07, and 0.10.


Fig. 6.1 Examples of the Visual Information Fidelity (VIF) metric: reference image (VIF = 1.0), contrast enhanced (VIF = 1.10), blurred (VIF = 0.07), and JPEG compressed (VIF = 0.10).

The ‘No-Reference (NR) Quality Assessment’ algorithms [Z. Wang, et al., 2000], [Sheikh, et al., 2005] were recently developed to evaluate image quality when no reference image is available. They are based on a quantization of the image distortion. Specifically, the edge sharpness level quantization relates to the edge width and the blur level estimation, while the noise level estimation is affected by the impulse noise level and the additive Gaussian noise level. Some existing NR metrics consist of image ringing and blurring measures, image noise level, as well as receiver operating characteristic (ROC) curves, as in medical image processing where the detection performance for a certain disease is considered.

Finally,  the  Adaptive  Image  Quality  Measure  (IAQM)  has  been  considered  [Bingabr, 
Varshney,  Farell,  2003]  as  a  measure  which  provides  the  peak  signal-to-noise  ratio  computed 
after  eliminating  the  errors  not  seen  by  the  eye  and  the  extent  (in  terms  of  percentage  of  blocks) 
to  which  the  image  is  corrupted. 

In  future  work,  we  will  utilize  such  metrics  to  improve  the  filtered  image  quality  and 
determine  the  best  parameters  for  the  SR  based  approach  as  in  cases  such  as  the  median  filtered 
image  discussed  in  Section  2.0.  By  doing  so,  automatic  filtering  algorithms  will  be  developed 
and  it  is  anticipated  that  performance  will  be  improved  over  state-of-the-art  methods. 

6.2  Detection  Enhancement  in  Imagery 

We again emphasize that the specific form of the noise PDF plays a critical role in achieving enhancement via SR. We reconsider the problem addressed in [Kay 2000], but now for a two-dimensional (150x150 pixel) image. In Fig. 6.2, a 6x6 window is used to process the data so that $N = 36$ pixels. Fig. 6.2a shows the original image with no additive noise while Fig. 6.2b shows the image with additive Gaussian mixture noise. Fig. 6.2b is now used as the baseline observed image containing signal and noise. In Fig. 6.2c, the sign detector processes the image of Fig. 6.2b, and in Fig. 6.2d we observe a modest image visualization improvement that results from adding white Gaussian SR noise with a variance of 2.0.


In  Figs.  6.2e  and  6.2f,  we  consider  the  problem  from  a  detection  viewpoint  showing  binary 
detection  results  for  the  sign  detector  without  and  with  Gaussian  SR  noise,  respectively.  Here, 
we  compare  the  test  statistic  values  to  a  threshold  to  decide  signal  presence  (white  region)  or 
signal  absence  (dark  region).  The  figures  demonstrate  the  SR  effect  on  detection  performance 
improvement.  Specifically,  the  detection  performance  within  the  signal  region  has  improved. 


Fig. 6.2 Image enhancement using Gaussian SR white noise, a.) signal image, b.) signal image plus Gaussian mixture noise, c.) sign detector test statistic, d.) sign detector test statistic with Gaussian white SR noise, e.) detection using sign detector without SR noise, f.) detection using sign detector with Gaussian SR noise.


Fig. 6.3 Image enhancement using the discrete SR noise PDF with $n_1 = -4.0$ and $n_2 = 2.0$, a.) test statistic values, $\lambda = 0.05$, b.) test statistic values, $\lambda = 0.1587$, c.) detection results, $\lambda = 0.05$, d.) detection results, $\lambda = 0.1587$.

In Fig. 6.3, however, we repeat the experiment using the discrete PDF as in (2.17), but now using $n_1 = -4.0$ and $n_2 = 2.0$. Figs. 6.3a and 6.3b show the test statistic values with probability values $\lambda = 0.05$ and $0.1587$, respectively. The results reveal a noticeable improvement in the image visualization quality when compared to Fig. 6.2d. Figs. 6.3c and 6.3d show the corresponding binary detection results, with significant enhancement over the results for Gaussian SR noise shown in Fig. 6.2f. A comparison of Figs. 6.3c and 6.3d further reveals the impact of the parameter $\lambda$ in controlling the $P_{FA}^y$ and $P_D^y$ levels.

Finally,  we  consider  the  application  of  the  theoretical  SR  detection  framework  to  an  actual 
image  to  determine  the  image  visualization  improvement.  In  Fig.  6.4,  we  consider  the  ‘Lena’ 
image  with  the  original  image  shown  in  Fig.  6.4a  and  the  image  with  a  high  threshold  level 
applied  in  Fig.  6.4b.  In  practice,  the  latter  would  represent  an  incorrect  binary  threshold  or 
perhaps  a  human  ‘eye-detector’  with  a  damaged  neuron  requiring  a  high  excitation  level.  In  Figs. 
6.4c,  d,  and  e,  specific  cases  of  Gaussian,  uniform,  and  optimal  discrete  SR  noise  are  considered, 
respectively.  The  results  demonstrate  the  potential  for  dramatic  image  visualization  improvement 
with the application of the appropriate SR noise PDF. In Phase II, attention will be given to the ‘a priori’ information considerations required to achieve the improvements in practice.


Fig. 6.4 Image visualization using the ‘Lena’ image a.) original Lena image, b.) image with a high threshold applied, c.) Gaussian SR noise, d.) uniform SR noise, e.) optimal discrete SR noise.

7.0 Application of Stochastic Resonance (SR) to Distributed Detection

Despite the progress achieved over the past two decades, the application of the SR effect to distributed detection has not been shown. Here, we investigate this application area for the dual hypothesis detection problem. We restrict ourselves to binary local sensor outputs, denoted by $u_k$, and consider the cases of both conditional independence and dependence among sensor observations. The degradation of detection performance caused by transmission errors between local sensor outputs and the fusion center is assessed. The relationship between the additive SR noise and system performance is explored. For the traditional two-stage approach using the Chair-Varshney fusion rule [Chair 1986], the role of additive SR noise at both the decoding stage and the decision stage is discussed. We show that the SR phenomenon exists under certain circumstances for both cases.

7.1  Stochastic  Resonance  Problem  Statement 

We again summarize the mathematical framework here for convenience. Given a $K$-dimensional data vector $\mathbf{x} \in R^K$, we decide between two hypotheses $H_1$ and $H_0$,

$$H_0: \; p_{\mathbf{x}}(\mathbf{x}; H_0) = p_0(\mathbf{x}) \tag{7.1a}$$

$$H_1: \; p_{\mathbf{x}}(\mathbf{x}; H_1) = p_1(\mathbf{x}) \tag{7.1b}$$

where $p_0(\mathbf{x})$ and $p_1(\mathbf{x})$ are the pdfs of $\mathbf{x}$ under $H_0$ and $H_1$, respectively. In order to make a decision, a test that can be completely characterized by a critical function (decision function) $\phi$, where $0 \le \phi(\mathbf{x}) \le 1$ for all $\mathbf{x}$, is required to choose between the two hypotheses. For any observation $\mathbf{x}$, this test chooses the hypothesis $H_1$ with probability $\phi(\mathbf{x})$. The detection performance of this test $\phi(\cdot)$ can be evaluated by its probability of detection $P_D$ and probability of false alarm $P_{FA}$. In order to enhance detection performance, we add SR noise to the original data process $\mathbf{x}$ and obtain a new data process $\mathbf{y}$ given by

$$\mathbf{y} = \mathbf{x} + \mathbf{n}, \tag{7.2}$$

where $\mathbf{n}$ is either an independent random process with pdf $p_{\mathbf{n}}(\cdot)$ or a nonrandom signal. For the case where the critical function $\phi(\cdot)$ is fixed, to improve $P_D$ without increasing $P_{FA}$, the optimum SR noise has been shown to consist of no more than two discrete vectors [Chen 2006a, f]; i.e., the optimal SR noise pdf $p_{\mathbf{n}}^{opt}(\mathbf{n})$ is of the following form,

$$p_{\mathbf{n}}^{opt}(\mathbf{n}) = \lambda\,\delta(\mathbf{n} - \mathbf{n}_1) + (1 - \lambda)\,\delta(\mathbf{n} - \mathbf{n}_2) \tag{7.3}$$

where $0 \le \lambda \le 1$, and $\mathbf{n}_1$ and $\mathbf{n}_2$ are suitable $K$-dimensional vectors.

7.2  Decision  Fusion  and  non-ideal  Transmission  Channels 

A typical parallel fusion model with transmission channels is shown in Fig. 7.1. For the $k$th local sensor, an independent binary decision $u_k$ is made based on its observations. Without loss of generality, assume that $u_k = 0$ if $H_0$ is decided and $u_k = 1$ otherwise. The detection performance of the $k$th local sensor node can be characterized by its corresponding probabilities of false alarm and detection, denoted by $P_{FAk}$ and $P_{Dk}$, respectively. The $k$th local decision $u_k$ is sent to the fusion center through a transmission channel $C_k$ characterized by $p(x_k | u_k)$. The final decision $u_0$ is made at the fusion center based on the received data $\mathbf{x} = [x_1, x_2, \ldots, x_K]$ and the fusion rule $\gamma$; i.e.,

$$u_0 = \gamma(x_1, x_2, \ldots, x_K). \tag{7.4}$$


Fig.  7.1  The  parallel  fusion  model. 

In general, two different fusion rules are applicable at the fusion center depending on the different definitions of the output $\mathbf{x}$. For the traditional two-stage approach, the output of each transmission channel $x_k$ is the estimate of $u_k$; i.e., the $k$th channel can be described as a binary channel with crossover error probabilities $\alpha_k$ and $\beta_k$. The fusion rule $\gamma_1$, assuming perfect connections between the local sensors and the fusion center, is given by

$$\gamma_1: \; \sum_{k=1}^{K} \log \left[ \frac{P_{Dk}(1 - P_{FAk})}{(1 - P_{Dk}) P_{FAk}} \right] x_k \underset{H_0}{\overset{H_1}{\gtrless}} \eta_1. \tag{7.5}$$


Further, by applying the channel model for the signal detection problem for the kth local sensor
and its corresponding channel $C_k$, the relationship between $x_k$ and the hypothesis $H_j$ can be
described as a two-layer transformation channel shown in Fig. 7.2 with its equivalent one-layer
model shown in Fig. 7.3. After accounting for channel errors, the equivalent kth sensor
probability of detection $P'_{Dk}$ and probability of false alarm $P'_{FAk}$ are given by


$$P'_{Dk} = (1 - \beta_k)\,P_{Dk} + \alpha_k (1 - P_{Dk}) \tag{7.6}$$

and

$$P'_{FAk} = (1 - \beta_k)\,P_{FAk} + \alpha_k (1 - P_{FAk}). \tag{7.7}$$
Fig. 7.3 Channel model for a signal detection problem in local sensor k.
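The mapping in (7.6) and (7.7) is simple arithmetic and can be checked directly. The sketch below uses a hypothetical helper name; `alpha` is taken to be the probability that a transmitted 0 is received as 1 and `beta` the probability that a transmitted 1 is received as 0, which is how the two equations use them.

```python
def equivalent_performance(p_d, p_fa, alpha, beta):
    """Apply (7.6) and (7.7): fold binary-channel errors into a sensor's P_D and P_FA.

    alpha: probability that a transmitted 0 is received as 1.
    beta:  probability that a transmitted 1 is received as 0.
    """
    p_d_eq = (1.0 - beta) * p_d + alpha * (1.0 - p_d)
    p_fa_eq = (1.0 - beta) * p_fa + alpha * (1.0 - p_fa)
    return p_d_eq, p_fa_eq
```

For the sensor used in the example of Section 7.3 (probability of detection 0.95, probability of false alarm 0.05, both crossover probabilities equal to 1/3), this returns the equivalent values 0.65 and 0.35 quoted there.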

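Therefore, when the channel statistics $\alpha_k$ and $\beta_k$ are available, the optimum fusion rule $\gamma_2$ at the
fusion center is given by the Chair-Varshney rule [Chair 1986]

$$\gamma_2 = \sum_{k=1}^{K} \log\left[\frac{(1 - P'_{FAk})\,P'_{Dk}}{(1 - P'_{Dk})\,P'_{FAk}}\right] x_k. \tag{7.8}$$

When the channel is perfect, i.e., $\alpha_k = \beta_k = 0$, we have $P'_{Dk} = P_{Dk}$ and $P'_{FAk} = P_{FAk}$, so that $\gamma_2 = \gamma_1$.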
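The relationship between the two fusion rules can be sketched in a few lines. This assumes the usual log-weight form of the Chair-Varshney statistic, in which each received bit is weighted by the log of the ratio of its detection and false alarm odds; the function names are hypothetical.

```python
import math

def cv_weight(p_d, p_fa):
    """Log-weight of one sensor in the Chair-Varshney statistic."""
    return math.log(((1.0 - p_fa) * p_d) / ((1.0 - p_d) * p_fa))

def fusion_statistic(x, p_d, p_fa, alpha=None, beta=None):
    """Weighted sum of the received bits x_k.

    With alpha and beta omitted the channels are treated as perfect (rule 1).
    Otherwise the channel-corrected probabilities of (7.6)-(7.7) are used (rule 2).
    """
    K = len(x)
    alpha = [0.0] * K if alpha is None else alpha
    beta = [0.0] * K if beta is None else beta
    total = 0.0
    for k in range(K):
        pd_eq = (1.0 - beta[k]) * p_d[k] + alpha[k] * (1.0 - p_d[k])
        pfa_eq = (1.0 - beta[k]) * p_fa[k] + alpha[k] * (1.0 - p_fa[k])
        total += cv_weight(pd_eq, pfa_eq) * x[k]
    return total
```

A noisy channel lowers the weight given to the affected sensor, which is why the channel-aware rule trusts that sensor's bit less than the rule that assumes perfect channels.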

Compared to the two-stage fusion approach where the output of channel k is the binary
estimate of $u_k$, channel-aware decision fusion rules have been developed recently [B. Chen
2004], [Niu 2006] based on the direct observation of channel data. In this approach, the output
of channel $C_k$ is no longer a binary variable but a continuous random variable. The exact form of
$p(x_k \mid u_k)$ depends on the coding rule at each local sensor and its corresponding noisy channel
model. For example, for many wireless sensor network scenarios, the kth channel between the
kth local sensor and the fusion center can be modeled as a unit power Rayleigh fading channel
with additive Gaussian noise of variance $\sigma_k^2$. Assuming that the local sensors send a
1 when $H_1$ is decided and -1 otherwise, the channel statistic $p(x_k \mid u_k)$ is given by
$$p(x_k \mid u_k = 1) = \frac{2\sigma_k}{\sqrt{2\pi}\,(1 + 2\sigma_k^2)} \exp\!\left(-\frac{x_k^2}{2\sigma_k^2}\right) \left[1 + \sqrt{2\pi}\,a_k x_k \exp\!\left(\frac{a_k^2 x_k^2}{2}\right) Q(-a_k x_k)\right] \tag{7.9}$$

and

$$p(x_k \mid u_k = 0) = \frac{2\sigma_k}{\sqrt{2\pi}\,(1 + 2\sigma_k^2)} \exp\!\left(-\frac{x_k^2}{2\sigma_k^2}\right) \left[1 - \sqrt{2\pi}\,a_k x_k \exp\!\left(\frac{a_k^2 x_k^2}{2}\right) Q(a_k x_k)\right], \tag{7.10}$$

where $a_k = \dfrac{1}{\sigma_k \sqrt{1 + 2\sigma_k^2}}$, and $Q(x) = \displaystyle\int_x^{\infty} \frac{1}{\sqrt{2\pi}} \exp(-t^2/2)\,dt$ is the complementary distribution
function of the standard Gaussian distribution.

Several decision fusion rules that require different degrees of a priori knowledge have been
proposed in [B. Chen 2004], [Niu 2006]. We summarize the test statistics for a few of them
here.
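The channel statistics for the Rayleigh fading model can be evaluated numerically. The sketch below assumes the densities take the form of a Gaussian factor multiplied by a bracketed correction involving the Q function, with a unit-power Rayleigh gain and noise variance `sigma2`. The function names and the midpoint-rule integrator are illustrative choices; a useful sanity check is that each density integrates to one.

```python
import math

def q_func(x):
    """Complementary CDF of the standard Gaussian."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def channel_pdf(x, u, sigma2):
    """p(x_k | u_k) for unit-power Rayleigh fading plus Gaussian noise of variance sigma2.

    u = 1 means +1 was sent; u = 0 means -1 was sent.
    """
    sigma = math.sqrt(sigma2)
    a = 1.0 / (sigma * math.sqrt(1.0 + 2.0 * sigma2))
    sign = 1.0 if u == 1 else -1.0
    prefactor = 2.0 * sigma / (math.sqrt(2.0 * math.pi) * (1.0 + 2.0 * sigma2))
    bracket = 1.0 + sign * math.sqrt(2.0 * math.pi) * a * x \
        * math.exp(0.5 * (a * x) ** 2) * q_func(-sign * a * x)
    return prefactor * math.exp(-x * x / (2.0 * sigma2)) * bracket

def integrate(f, lo, hi, n=8000):
    """Midpoint-rule integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))
```

The two densities are mirror images of each other, so evaluating one at x gives the same value as evaluating the other at -x.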


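1. Chair-Varshney Fusion Rule

$$\gamma_3 = \sum_{k=1}^{K} \log\left[\frac{(1 - P_{FAk})\,P_{Dk}}{(1 - P_{Dk})\,P_{FAk}}\right] I(x_k) \tag{7.11}$$

where $I(\cdot)$ is an indicator function

$$I(x) = \begin{cases} 1, & x > 0 \\ 0, & x < 0. \end{cases}$$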


2. Equal Gain Combining (EGC) Fusion Statistic

$$\gamma_{EGC} = \frac{1}{K} \sum_{k=1}^{K} x_k \tag{7.12}$$


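3. Likelihood Ratio Test Based on Channel Statistics (LRT-CS)

These test statistics are based on the knowledge of channel statistics and local detection
performance indices

$$\gamma_{LRT\text{-}CS} = \sum_{k=1}^{K} \log \frac{P_{Dk}\left[1 + \sqrt{2\pi}\,a_k x_k \exp(a_k^2 x_k^2 / 2)\,Q(-a_k x_k)\right] + (1 - P_{Dk})\left[1 - \sqrt{2\pi}\,a_k x_k \exp(a_k^2 x_k^2 / 2)\,Q(a_k x_k)\right]}{P_{FAk}\left[1 + \sqrt{2\pi}\,a_k x_k \exp(a_k^2 x_k^2 / 2)\,Q(-a_k x_k)\right] + (1 - P_{FAk})\left[1 - \sqrt{2\pi}\,a_k x_k \exp(a_k^2 x_k^2 / 2)\,Q(a_k x_k)\right]} \tag{7.13}$$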
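The three test statistics can be compared side by side with a short sketch. It assumes the log-weight form of the Chair-Varshney statistic applied to hard decisions, a plain average for EGC, and a per-sensor likelihood ratio built from the Rayleigh channel statistics for LRT-CS. All function names are hypothetical, and the indicator is taken to be 1 for strictly positive inputs.

```python
import math

def q_func(x):
    """Complementary CDF of the standard Gaussian."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def chair_varshney(x, p_d, p_fa):
    """Chair-Varshney statistic applied to the hard decisions I(x_k)."""
    return sum(
        math.log(((1.0 - pf) * pd) / ((1.0 - pd) * pf)) * (1.0 if xk > 0 else 0.0)
        for xk, pd, pf in zip(x, p_d, p_fa)
    )

def egc(x):
    """Equal gain combining: the plain average of the channel outputs."""
    return sum(x) / len(x)

def lrt_cs(x, p_d, p_fa, sigma2):
    """Log-likelihood ratio built from the Rayleigh channel statistics."""
    total = 0.0
    for xk, pd, pf, s2 in zip(x, p_d, p_fa, sigma2):
        a = 1.0 / math.sqrt(s2 * (1.0 + 2.0 * s2))
        t = math.sqrt(2.0 * math.pi) * a * xk * math.exp(0.5 * (a * xk) ** 2)
        like1 = 1.0 + t * q_func(-a * xk)  # proportional to p(x_k | u_k = 1)
        like0 = 1.0 - t * q_func(a * xk)   # proportional to p(x_k | u_k = 0)
        total += math.log((pd * like1 + (1.0 - pd) * like0)
                          / (pf * like1 + (1.0 - pf) * like0))
    return total
```

The common Gaussian factor of the two channel densities cancels in the likelihood ratio, which is why only the bracketed terms appear in `lrt_cs`.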
It has been shown that although $\gamma_3$ is near optimal when the channel SNR is high, it suffers
significant performance loss at low to moderate channel SNR. However, as shown in the next
section, the detection performance of $\gamma_3$ can be improved by adding independent SR noise.

7.3  Noise  Enhanced  Decision  Fusion 

We now consider two examples to demonstrate the possible SR effect in decision fusion.
Consider the first decision fusion approach, where an estimate of $u_k$ is obtained before being sent
to the fusion center. Here, two sensors are involved in the system. For sensor 1, we assume that
$P_{D1} = 0.8$, $P_{FA1} = 0.1$, and the detection performance for sensor 2 is $P_{D2} = 0.95$, $P_{FA2} = 0.05$. We
further assume that channel 1 is a perfect channel while channel 2 is a noisy channel with
crossover error probabilities $\alpha_2 = \beta_2 = 1/3$. In practice, this may depict a scenario in which sensor
1 is located far from the signal source but close to the fusion center, whereas sensor 2
is located close to the signal source but far from the fusion center. Therefore, at the fusion
center, the detection performance of the second sensor is actually equivalent to $P'_{D2} = 0.65$ and
$P'_{FA2} = 0.35$. The detection performance of fusion rules $\gamma_1$ and $\gamma_2$ is shown in Fig. 7.4. Clearly,
due to the performance loss in the noisy channel, $\gamma_1$ is no longer the optimum fusion rule and its
detection performance is degraded. In order to improve the performance of $\gamma_1$, we add SR noise
to the observed data $x_2$ to obtain a new data sample $y_2$. Since $x_2$ is a discrete random variable,
we use the noisy binary channel model with crossover probabilities $\alpha_{SR}$ and $\beta_{SR}$ to generate the
new noisy SR data sample $y_2$. Thus, the procedure here is to utilize the crossover error
information in the enhancement procedure. Specifically, $x_2$ is observed at the fusion center. The
decision $x_2$ is either retained or changed depending upon the outcome of a comparison of a
uniformly distributed random variable $w$ with the specified values $\alpha_{SR}$ and $\beta_{SR}$. If $x_2$ is
observed as a zero and $w < \alpha_{SR}$, then $x_2$ is changed to a one. Conversely, if $x_2$ is observed as a
one and $w < \beta_{SR}$, then $x_2$ is changed to a zero. It is interesting to note that the randomized
procedure actually introduces additional errors in sensor 2, but in doing so places more emphasis on
sensor 1, whose channel crossover errors are negligible.
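The randomized procedure just described can be sketched as follows. The function names are hypothetical, and the uniform variable is assumed to be drawn on the unit interval. The second helper reuses the binary-channel arithmetic of (7.6) and (7.7) to give the effective performance of sensor 2 after the extra randomization stage.

```python
import random

def sr_flip(x2, alpha_sr, beta_sr, rng=random):
    """Randomly alter a received bit: 0 -> 1 w.p. alpha_sr, 1 -> 0 w.p. beta_sr."""
    w = rng.random()  # uniform on [0, 1)
    if x2 == 0 and w < alpha_sr:
        return 1
    if x2 == 1 and w < beta_sr:
        return 0
    return x2

def cascade(p_d, p_fa, alpha, beta):
    """Effective (P_D, P_FA) after a binary channel with crossovers alpha and beta."""
    return ((1.0 - beta) * p_d + alpha * (1.0 - p_d),
            (1.0 - beta) * p_fa + alpha * (1.0 - p_fa))
```

Starting from the equivalent values 0.65 and 0.35 for sensor 2, a setting with no 0-to-1 flips and a 1-to-0 flip probability of 0.5 gives an effective detection probability of 0.325 and false alarm probability of 0.175 for that sensor.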

The fusion performance of $\gamma_1$ using the new data sample $y_2$ is also plotted in Fig. 7.4. When
$\alpha_{SR} = 0$ and $\beta_{SR} = 0.5$, a higher $P_D$ for this SR modified fusion system is observed for $P_{FA} \in$
[0.07, 0.35] when compared to the original $\gamma_1$. A similar effect is also observed for the parameter
setting with $\alpha_{SR} = 0.5$ and $\beta_{SR} = 0$. Furthermore, it can be shown that performance enhancement
for the shaded region in Fig. 7.4 is possible by adding suitable SR noise.

In the next example, in order to examine the possible SR effect in decision fusion for a
wireless sensor network, we choose the number of sensors to be K = 8, with $P_{Dk} = 0.5$ and $P_{FAk} =
0.05$ for each sensor. The SR noise n is chosen to be a DC value of A; i.e., instead of using the
original data $x_k$ to perform decision fusion using $\gamma_3$, new SR modified data $y_k = x_k + A$ is used.
Due to the computational complexity of this detection problem, the detection performance
evaluation is obtained by intensive Monte Carlo simulation. Fig. 7.5 shows the deflection
measures [Picinbono 1995] for different fusion rules with SR noise A = -0.2, where the deflection
measure is defined as
$$D(\gamma) = \frac{\left[E(\gamma \mid H_0) - E(\gamma \mid H_1)\right]^2}{\mathrm{Var}(\gamma \mid H_0)} \tag{7.14}$$
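In a Monte Carlo setting the deflection measure is estimated from samples of the test statistic generated under each hypothesis. A minimal sketch follows, assuming the squared difference of the conditional means divided by the variance under the null hypothesis; the function name is hypothetical.

```python
from statistics import fmean, pvariance

def deflection(samples_h0, samples_h1):
    """Sample estimate of D = [E(g|H0) - E(g|H1)]^2 / Var(g|H0)."""
    m0 = fmean(samples_h0)
    m1 = fmean(samples_h1)
    v0 = pvariance(samples_h0, mu=m0)
    return (m0 - m1) ** 2 / v0
```

The population variance is used here, so the estimate is slightly biased for small sample sizes; with the large sample counts typical of Monte Carlo evaluation the difference is negligible.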


Fig.  7.4  Detection  performance  comparison  of  different  fusion  rules  and  SR  noise. 

As observed in Fig. 7.5, for most values of channel SNR, by adding a stochastic resonance noise
n = A = -0.2 to the observed data x, the deflection coefficient is improved. An interesting
observation is that when channel SNR is between 10 dB and 20 dB, the deflection coefficient of
SR modified $\gamma_3$ is even higher than that of LRT-CS. However, this does not imply that the
detection performance of SR modified $\gamma_3$ is better than LRT-CS since these test statistics are not
Gaussian distributed.


Fig.  7.5  Deflection  Coefficient  for  different  test  statistics,  A  =  -0.2. 
For a fixed channel SNR and SR enhanced $\gamma_3$ decision fusion, the relationship between
different values of A and the deflection coefficient D is shown in Fig. 7.6. As A starts becoming
negative, the deflection coefficient D first increases and then, after attaining its peak, decreases
as A decreases further. The optimum value of A, namely $A_0$, which maximizes D is an increasing
function of SNR. When SNR is very high, we have $A_0 \approx 0$, which is consistent with the
conclusion drawn in [B. Chen 2004], [Niu 2006] where the asymptotic optimality of $\gamma_3$ is derived;
i.e., SR noise will not improve performance at high SNR.


Fig.  7.6  Deflection  coefficient  for  SR  enhanced  73  decision  fusion  for  different  channel  SNR. 

Fig. 7.7 gives the ROC curves corresponding to different fusion statistics at a channel SNR of
5 dB. Clearly, the SR modified $\gamma_3$ fusion rule provides better detection performance than both
the EGC and the original $\gamma_3$ rule.

To explain this SR enhanced detection phenomenon, we first obtain the relationship between
the Rayleigh fading channel model and the binary channel model. Corresponding to the
transmission channel illustrated in Fig. 7.2, we have for $\gamma_3$,

$$\alpha_k = p(I(y_k) = 1 \mid u_k = 0), \tag{7.15}$$

and

$$\beta_k = p(I(y_k) = 0 \mid u_k = 1) = 1 - p(I(y_k) = 1 \mid u_k = 1). \tag{7.16}$$

From (7.9) and (7.10), after some calculation, we have
$$p(I(y_k) = 1 \mid u_k = 1) = p((x_k + A) > 0 \mid u_k = 1)$$
From (7.17) and (7.18), it can be shown that $\alpha_k$ monotonically decreases and $\beta_k$ monotonically
increases as A decreases. An illustration of this relationship is shown in Fig. 7.8 for the case of
channel SNR = 5 dB. Also, from (7.6) and (7.7), it can be shown that for any fixed channel SNR and
probability of false alarm $P_{FA}$, the probability of detection $P_D$ given by the SR modified fusion
rule $\gamma_3$ is determined by the crossover error probabilities $\alpha_k$ and $\beta_k$, k = 1, 2, ..., K, which are
functions of A. Therefore, there exists a suitable A which yields the best detection performance,
i.e., maximizes $P_D$ for a given $P_{FA}$. When A = 0, $\alpha_k = \beta_k = \dfrac{1}{2} - \dfrac{1}{2\sqrt{1 + 2\sigma_k^2}}$. When SNR
is very high, $\alpha_k, \beta_k \to 0$, and the channel $C_k$ becomes a near perfect channel. As a result, $\gamma_3$
becomes a near optimum fusion rule.
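The dependence of the equivalent crossover probabilities on the DC offset A can be checked by simulation. The sketch below assumes the received sample is a unit-power Rayleigh gain times the transmitted symbol (+1 or -1) plus zero-mean Gaussian noise, and that a hard decision of 1 is made when the offset sample is positive. Function names, the trial count, and the seed are illustrative choices.

```python
import math
import random

def crossover_at_zero(sigma2):
    """Closed form for the common crossover probability when A = 0."""
    return 0.5 - 0.5 / math.sqrt(1.0 + 2.0 * sigma2)

def estimate_crossovers(A, sigma2, trials=100_000, seed=0):
    """Monte Carlo estimate of (alpha_k, beta_k) for y_k = x_k + A.

    Assumes x_k = h*s + n with s = +1 (u_k = 1) or -1 (u_k = 0), h a unit-power
    Rayleigh gain, and n zero-mean Gaussian noise of variance sigma2.
    """
    rng = random.Random(seed)
    sigma = math.sqrt(sigma2)
    false_ones = 0   # I(y_k) = 1 although u_k = 0
    false_zeros = 0  # I(y_k) = 0 although u_k = 1
    for _ in range(trials):
        h = math.sqrt(rng.expovariate(1.0))  # h^2 ~ Exp(1), so E[h^2] = 1
        n = rng.gauss(0.0, sigma)
        if -h + n + A > 0:
            false_ones += 1
        if h + n + A <= 0:
            false_zeros += 1
    return false_ones / trials, false_zeros / trials
```

Calling `estimate_crossovers` with a negative A and comparing against the result for A = 0 shows the trade described above: the probability of a false one falls while the probability of a false zero rises.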


Fig.  7.7  ROC  curves  for  various  fusion  statistics;  SNR  =  5dB,  A  =  -0.2. 
Fig. 7.8 The equivalent channel crossover errors as a function of A, SNR = 5dB.


7.4  Summary  of  Results  for  Distributed  Detection  using  SR 

In this section, we have considered the detection performance of distributed detection and
fusion systems in the presence of non-ideal transmission channels. For fusion of decisions
transmitted over channels that can be modeled as binary channels, we showed that the detection
performance of some decision fusion systems can be improved by randomly changing the
received binary signal, i.e., by adding stochastic resonance noise. For the problem of fusion of
decisions transmitted through a Rayleigh fading channel, we established the equivalence between
this fading channel and the binary channel model for the widely used two-stage Chair-Varshney
fusion rule. We further demonstrated the existence of the SR phenomenon in this fusion problem
by adding a discrete DC value to the observed signal at the fusion center. A significant
improvement of detection performance is reported when suitable noise is selected. Further
results, including an adaptive approach to learn the optimum noise value n and an extension of this
SR effect to other decision fusion systems, will be forthcoming.
8.0 Optimal Decision Processing by the Transformation Method

For  decision  problems  whose  decision  region  is  suboptimal,  we  now  show  how  to  transform 
the  decision  statistic  to  recover  optimal  performance  [Kay,  2006a].  This  novel  methodology 
holds  considerable  promise  in  a  wide  range  of  application  areas  involving  decision  theory.  The 
procedure  simply  amounts  to  transforming  the  decision  statistic  to  yield  a  combined 
statistic/decision  region  which  is  optimal.  The  approach  may  be  thought  of  as  a  generalization  of 
the  stochastic  resonance  phenomenon,  which  employs  a  random  linear  transformation,  and  hence 
should  be  widely  applicable  to  many  practical  problems.  Maximization  of  the  probability  of  a 
correct  decision,  as  considered  here  for  example,  has  direct  applications  to  communication 
theory.  The  method  considered  here  treats  the  univariate  case.  Its  generalization  to  the 
multivariate  case  will  be  considered  in  future  work. 

8.1  Mathematical  Description  of  the  Transformation  Method 

A mathematical justification of the approach is given in this subsection with an example in
the next subsection. Here, we assume that the problem is to decide between two hypotheses $H_0$
and $H_1$ based on the observed scalar test statistic x. The two hypotheses are assumed to be
random events with prior probabilities $\pi_0$ and $\pi_1$. This test statistic is a function of the original
data. Future work will address the extension to the case when the original data is
accessible. Based on the observed data sample x, a decision rule has been implemented as
follows:

$$\phi(x) = 1 \quad \text{decide } H_1 \tag{8.1a}$$

$$\phi(x) = 0 \quad \text{decide } H_0. \tag{8.1b}$$

This decision rule is assumed to be suboptimal. Denoting the probability density functions
(PDFs) as $p_0^X(x)$ and $p_1^X(x)$ under $H_0$ and $H_1$, respectively, the probability of a correct decision
is

$$P_C = \pi_0 \int_{-\infty}^{\infty} (1 - \phi(x))\,p_0^X(x)\,dx + \pi_1 \int_{-\infty}^{\infty} \phi(x)\,p_1^X(x)\,dx = \pi_0 + \int_{-\infty}^{\infty} \phi(x)\left[\pi_1 p_1^X(x) - \pi_0 p_0^X(x)\right] dx. \tag{8.2}$$

Note that unless $\phi(x) = 1$ for all x such that $\pi_1 p_1^X(x) - \pi_0 p_0^X(x) > 0$ and zero otherwise
(which is the optimal decision rule), this probability will not be maximized.

Now consider that we transform the test statistic as y = g(x) using some function g. The
function is assumed to be piece-wise monotonic so that over each interval of a finite number of
disjoint intervals either g'(x) > 0 or g'(x) < 0 (the prime denotes differentiation). The
transformed test statistic y is then input to the decision rule $\phi(\cdot)$ to yield $\phi(y)$. This is in
accordance with the assumption that the decision rule is fixed and so cannot be changed. Only
the test statistic can be modified. We will see, however, that this is mathematically equivalent to
modifying the decision rule. To do so note that $P_C$, which is now based on y, is

$$P_C = \pi_0 + \int_{-\infty}^{\infty} \phi(y)\left[\pi_1 p_1^Y(y) - \pi_0 p_0^Y(y)\right] dy. \tag{8.3}$$


We now utilize the piece-wise monotonic assumption of g(x) to write the real line as
$\mathbb{R} = \cup_{i=1}^{N} I_i = \cup_{i=1}^{N} (a_i, b_i)$, where $a_1 < b_1 \le a_2 < b_2 \le \cdots \le a_N < b_N$ and the intervals are open. The
function $g(\cdot)$ is monotonic over each interval $I_i$. The omission of a finite number of points from
the integral will not affect the results as long as the PDFs do not contain any impulses at these
points (or the cumulative distribution function is continuous over all of $\mathbb{R}$). Now we have that by
defining

$$J_i = \begin{cases} (a_i, b_i) & \text{if } g'(I_i) > 0 \\ (b_i, a_i) & \text{if } g'(I_i) < 0 \end{cases}$$

and using a change of variables from y to g(x)

$$P_C = \pi_0 + \int \phi(g(x))\left[\pi_1 p_1^Y(g(x)) - \pi_0 p_0^Y(g(x))\right] g'(x)\,dx \tag{8.4}$$

$$= \pi_0 + \sum_{i=1}^{N} \int_{J_i} \phi(g(x))\left[\pi_1 p_1^Y(g(x)) - \pi_0 p_0^Y(g(x))\right] g'(x)\,dx. \tag{8.5}$$


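Note that for the intervals for which g'(x) < 0, we have $J_i = (b_i, a_i)$. Absorbing the negative sign
into g'(x) for the monotonically decreasing function intervals yields

$$P_C = \pi_0 + \sum_{i=1}^{N} \int_{I_i} \phi(g(x))\left[\pi_1 p_1^Y(g(x)) - \pi_0 p_0^Y(g(x))\right] |g'(x)|\,dx. \tag{8.6}$$

Next we recognize that $p_1^Y(g(x))\,|g'(x)| = p_1^X(x)$ and $p_0^Y(g(x))\,|g'(x)| = p_0^X(x)$, so that we
finally have

$$P_C = \pi_0 + \sum_{i=1}^{N} \int_{I_i} \phi(g(x))\left[\pi_1 p_1^X(x) - \pi_0 p_0^X(x)\right] dx = \pi_0 + \int_{-\infty}^{\infty} \phi(g(x))\left[\pi_1 p_1^X(x) - \pi_0 p_0^X(x)\right] dx. \tag{8.7}$$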


We now have that the probability of a correct decision is based on $\phi(g(x))$. In effect, by
transforming the decision test statistic x to g(x) we have been able to modify the
decision region. It is clear from (8.7) that for optimal performance we must have

$$\phi^*(x) = \phi(g(x)) = \begin{cases} 1 & \text{if } \pi_1 p_1^X(x) > \pi_0 p_0^X(x) \\ 0 & \text{otherwise} \end{cases} \tag{8.8}$$

where $\phi^*(x)$ denotes the optimal decision rule based on x. In composite function notation, we
require for optimality that

$$(\phi \circ g)(x) = \phi(g(x)) = \phi^*(x). \tag{8.9}$$

We need only determine the function $g(\cdot)$. We provide an example in the next section.


8.2  An  Example  of  the  Transformation  Method 

We now consider a very simple example, for which the solution is obvious. The example is
that of deciding between $x \sim \mathcal{N}(0,1)$ under $H_0$ and $x \sim \mathcal{N}(1,1)$ under $H_1$, where $\mathcal{N}(\mu, \sigma^2)$ denotes a
Gaussian PDF with mean $\mu$ and variance $\sigma^2$. The prior probabilities are $\pi_0 = \pi_1 = 1/2$. We assume
that the suboptimal decision rule is to decide $H_1$ if x > 0 and decide $H_0$ if x < 0. The optimal
decision rule for this problem is to decide $H_1$ if x > 1/2 and decide $H_0$ if x < 1/2. This is just the
maximum likelihood decision rule [Kay 1998]. The suboptimal decision rule produces the
correct decision for all x not in the interval [0, 1/2). It is clear now that to modify the suboptimal
decision rule to make it optimal we need only map the values of x in the interval [0, 1/2) into any
other interval for which the suboptimal decision rule will produce a zero at its output. For
example, we could use

$$g(x) = \begin{cases} x & \text{for } x \ge 1/2 \text{ and } x < 0 \\ -x & \text{for } 0 \le x < 1/2. \end{cases} \tag{8.10}$$

Note that the effect of the transformation is to do nothing (g(x) = x) if the test statistic value will
produce the correct decision. However, in the interval [0, 1/2) the decision is incorrect. To
convert it to a correct decision, we simply negate the value of the test statistic as g(x) = -x. Then,
the values 0 < x < 1/2 become negative and are decided to correspond to $H_0$ in accordance with the
suboptimal decision rule. Finally, it should be observed that the function chosen is piece-wise
monotonic (as well as discontinuous).
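The gain from the transformation in this example can be verified numerically by evaluating the probability of a correct decision as the prior of the null hypothesis plus the integral of the decision rule times the weighted density difference. The sketch below uses a midpoint rule over a truncated range; the function names, integration limits, and step count are illustrative assumptions.

```python
import math

def normal_pdf(x, mean):
    """Unit-variance Gaussian density."""
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2.0 * math.pi)

def phi(x):
    """Fixed suboptimal rule: decide H1 (return 1) when the statistic is positive."""
    return 1.0 if x > 0 else 0.0

def g(x):
    """Negate the statistic on [0, 1/2); leave it unchanged elsewhere."""
    return -x if 0.0 <= x < 0.5 else x

def prob_correct(rule, pi0=0.5, pi1=0.5, lo=-10.0, hi=11.0, n=21_000):
    """Midpoint-rule value of P_C = pi0 + integral of rule(x) [pi1 p1(x) - pi0 p0(x)] dx."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        total += rule(x) * (pi1 * normal_pdf(x, 1.0) - pi0 * normal_pdf(x, 0.0))
    return pi0 + h * total
```

Evaluating `prob_correct(phi)` gives about 0.671 for the original rule, while `prob_correct(lambda x: phi(g(x)))` gives about 0.691, which matches the value attained by the optimal threshold at 1/2.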
Figure 8.1 Transforming Function - One of many possibilities.

9.0  Recommendations  and  Future  Considerations 

9.1  Extensions  to  the  Optimized  SR  Detection  and  Estimation  Framework 

a.  Optimized  SR  Detection  Framework  for  the  Unknown  PDF  Problem 

In  our  Phase  I  effort,  as  discussed  earlier,  we  have  addressed  the  problem  of  using  SR  to 
improve  detection  performance  for  the  case  when  the  probability  density  functions  (PDFs)  under 
the  two  hypotheses  are  known.  In  future  work,  the  emphasis  must  now  be  placed  on  the  problem 
of  adaptive  learning  and  training  of  SR  modified  detectors  in  the  unknown  PDF  case.  For  any 
specific  test,  we  assume  that  the  prior  probabilities  of  both  hypotheses  are  known  and  that  we 
have  the  knowledge  of  the  model  of  this  detection  problem.  Therefore,  given  a  set  of  testing 
samples,  the  underlying  PDFs  for  the  two  hypotheses  detection  problem  may  be  determined  by 
statistical  learning  methods  [Vapnik  1996].  Furthermore,  this  learned  PDF/CDF  can  be 
employed  further  to  construct  a  better  detector  or  to  improve  the  detection  performance  of  the 
detector  obtained  by  the  SR  approach  introduced  in  two  of  our  papers  [Chen,  et.  al.,  2006a,  b] 
developed  during  Phase  I.  Other  model  based  learning  algorithms  such  as  the  Ozturk  algorithm 
[Ozturk,  1992]  developed  at  Syracuse  University  (used  to  learn  the  distribution  model)  and  the 
EM  algorithm  (used  to  learn  the  parameters  of  a  certain  model)  shall  also  be  explored.  The 
information  extracted  from  data  via  these  methods  will  be  further  utilized  to  determine  the  form 
of  the  optimum  SR  noise  distribution  for  the  detection  problem  under  consideration. 

b.  Variable  Decision  Functions 

In our prior work in Phase I, the test statistics as well as the detection threshold were
assumed fixed. Extension of the SR formulation of the fixed test statistic and detection threshold
case to the variable threshold and variable statistic case will be pursued. Let $\Phi$ denote the set of
all possible test statistics, $\Theta$ be the set of all possible thresholds and $\Pi$ be the set of all possible
SR noise PDFs. For a Neyman-Pearson detection approach, we know that for any fixed
probability of false alarm $P_{FA}$, its corresponding $P_D$ cannot be greater than 1. Thus, there must
exist at least one $(\phi, \theta, \pi) \in (\Phi, \Theta, \Pi)$ that maximizes $P_D$ with false alarm rate less than or equal
to $P_{FA}$. The same conclusion can also be drawn for the Bayesian case. From our work reported in
[Chen, et al., 2006a] and [Kay, et al., 2006], we have $\pi = p_n(x) = \delta(x - c)$ for the Bayesian
approach and $\pi = p_n(x) = \lambda\,\delta(x - d) + (1 - \lambda)\,\delta(x - f)$ for the NP approach. The next step of our
proposed Phase II effort in this area is to establish some relationships among $\phi$, $\theta$ and $\pi$ for the
variable decision function and to further simplify the procedure to find the best SR detection
systems.

c.  Optimum  SR  Noise  Considerations  under  Constraints 

In  some  practical  applications,  additional  constraints  on  SR  noise  may  be  applicable.  For 
example,  in  a  wireless  sensor  network  (WSN),  the  power  consumption  at  each  sensor  is  a 
concern.  In  this  case,  we  would  want  to  know  the  form  of  optimum  SR  noise  PDF  if  the  detector 
or sensor's power is limited. This important case of SR noise under potential constraints will be
investigated. 

d.  Optimum  SR  Noise  for  the  Multiple  Hypotheses  Test 

In  our  prior  work  in  Phase  I,  we  focused  our  attention  on  binary  hypothesis  testing  problems 
with  considerable  success.  We  now  intend  to  extend  our  results  to  the  multiple  hypotheses 
testing  problem.  This  problem  is  encountered  in  classification  problems  in  various  applications 
such  as  target  identification,  medical  imaging,  and  remote  sensing.  As  is  well  known  from 
statistical  decision  theory,  due  to  the  complexity  of  the  problem  and  unlike  the  two  hypotheses 
test,  the  optimum  solution  of  a  multiple  hypotheses  test  is  very  difficult  to  determine.  The  form 
of  the  optimum  decision  regions  is  not  rectangular  in  general  and  in  many  cases  non-optimum 
rectangular  decision  regions  are  employed.  Therefore,  our  SR  approach  may  play  a  very 
important  role  in  this  case.  The  role  of  SR  has  not  been  investigated  in  this  context.  In  future 
analyses,  we  will  seek  to  determine  the  form  of  the  optimum  SR  noise  distributions  for  this 
important  case. 

e.  Robust  Nonlinear  Systems  and  Robust  SR  Noise 

In  the  Phase  I  study,  we  evaluated  the  detection  performance  of  certain  detectors  by  adding 
independent  SR  noise  provided  the  PDF  for  each  hypothesis  is  known  and  fixed.  In  the  next  step 
of  our  research,  a  set  of  nonlinear  systems  will  be  considered  and  their  robustness  subject  to 
small  variations  of  the  input  signals  will  be  evaluated.  The  relationship  between  the  nonlinear 
system  and  SR  noise  from  the  viewpoint  of  robustness  shall  be  explored. 

f.  SR  Enhanced  Signal  Estimation 

Dithering related techniques (in the SR sense) have been widely used in preserving signal
information before analog-to-digital (A/D) quantization or some other forms of transformations
(preprocessing) [Wannamaker, 2000]. The quantized/transformed signal is then used as the input
to  various  signal  processing  systems  (post  processing).  However,  due  to  the  fact  that  many 
signal  processing  systems  are  nonlinear,  the  quantized/transformed  signal  may  not  be  the  optimal 
input  for  certain  kinds  of  systems.  This  is  mainly  due  to  the  following  two  reasons:  1.  The  SR 
noise  in  the  preprocessing  part  may  not  be  optimum.  2.  There  may  exist  a  mismatch  between  the 
SR  noise  modified  preprocessed  signal  and  the  signal  processing  system  that  follows;  i.e.,  the  SR 
noise  and  its  related  preprocessed  signal  may  be  the  optimum  signal  for  one  post  processing 
system,  but  may  not  be  optimum  for  another.  Thus,  it  is  a  very  important  and  interesting 
problem  as  to  how  to  tune  the  preprocessed  signal  again  by  adding  SR  noise  or  by  some  other 
methods.  As  an  example,  in  some  preliminary  work,  we  have  demonstrated  the  existence  of  such 
an  SR  phenomenon  in  a  system  where  maximum  likelihood  estimation  (MLE)  is  applied  to 
estimate  the  signal  parameter  from  the  1-bit  quantized  data.  This  notion  will  be  investigated  in 
more  detail  for  a  wider  class  of  problems  and  a  systematic  theory  for  estimation  problems  will  be 
developed  just  as  we  have  initiated  the  theoretical  formulation  for  detection  problems  in  our 
Phase  I  work. 

g.  Numerical  Approaches  to  Optimum  SR  Noise  PDF  Determination 
Numerical optimization approaches will be pursued to obtain an optimal solution for the SR noise
PDF. As shown in our previous results, the theoretical optimum SR noise PDF is a
single impulse for the Bayesian approach and two-peak noise for the NP approach subject to a
constant  false  alarm  rate  (CFAR)  constraint.  It  is  relatively  easy  to  obtain  the  solution  for  the 
single  dimensional  problem.  However,  for  higher  dimensional  problems,  it  is  much  more 
difficult  to  find  the  solution.  Thus,  a  set  of  numerical  approaches  including  but  not  limited  to 
Genetic  Algorithm,  Simulated  Annealing  and  Particle  Swarm  Optimization  techniques  need  to  be 
examined  for  potential  use  for  this  problem.  We  plan  to  conduct  a  study  and  characterize  the 
efficacy  of  different  optimization  algorithms  for  SR  problems. 

h.  Nonlinear  Stochastic  Resonance 

The concept of SR will be extended to a much broader context. So far, we have restricted
ourselves to the additive form of SR noise; i.e., we have only considered y = x + n. Here, we
will consider the case where the noise can actually take any functional form, i.e., we will extend
it to the more complicated case where y is a function of input x and noise n, such that y = f(x, n).
For example, we may consider an SR noise modified observation model to be multiplicative, i.e.,
$y = n \cdot x$. Further, instead of considering only one source of noise, multiple noise sources can be
considered. For example, one noise source could be multiplicative and another additive. One
such noise model is applicable when a signal is transmitted over a Rayleigh fading
channel, where $y = n_1 x + n_2$. The detection performance, the optimum solution and the
performance bounds for these more general SR schemes will be investigated, with improved
performance anticipated.

i.  Detection  Enhancement  in  the  Presence  of  Correlated  Non-Gaussian  Noise 

Further  consideration  should  also  be  given  to  performance  improvement  to  be  realized  by 
SR  applied  to  nonparametric  detectors  for  the  problem  of  signal  detection  in  correlated  non- 
Gaussian  noise  and  additive  white  Gaussian  noise.  Here,  the  bimodal  models  considered  in 
Phase  I  shall  be  generalized  to  the  multi-modal  compound  Gaussian  model.  Further,  a  very 
general  class  of  non-Gaussian  processes  known  as  Spherically  Invariant  Random  Processes 
(SIRP)  [Yao,  1973]  shall  be  considered.  The  theory  of  SIRPs  provides  an  elegant  and 
mathematically  tractable  approach  for  the  generation  of  multivariate  non-Gaussian  PDFs.  Issues 
such  as  detection  performance  robustness  and  estimation  efficiency  (i.e.,  sample  support  size)  are 
essential  for  these  analyses. 

There  are  two  types  of  models  for  correlated  non-Gaussian  processes:  (1)  the  endogenous 
model  and  (2)  the  exogenous  product  model  [Conte,  1987].  For  the  endogenous  model,  the 
desired  non-Gaussian  PDF  and  correlation  function  is  realized  using  a  zero  memory,  non-linear 
transformation  on  a  real  Gaussian  process.  In  this  approach,  however,  it  is  not  possible  to 
control  both  the  PDF  and  the  correlation  independently  [Rangaswamy,  1993].  Further,  the 
nonlinear transformation may give rise to non-Gaussian processes whose covariance matrix is
not non-negative definite. For the exogenous model, however, the desired non-Gaussian process is
generated  by  the  product  of  a  Gaussian  random  process  and  an  independent  non-Gaussian 
process  which  can  be  correlated.  Thus,  the  PDF  and  correlation  can  be  independently  controlled. 
This feature provides an important capability for both detection and estimation performance
assessments.  Issues  such  as  detection  performance  robustness  and  estimation  efficiency  (i.e., 
sample  support  size)  are  of  prime  interest  for  these  analyses. 
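The exogenous product model can be illustrated with a short simulation. The sketch below is only one possible instance: it assumes a gamma-distributed, unit-mean texture drawn once per vector and a unit-variance first-order autoregressive Gaussian process, neither of which is specified in this report. The function names and parameters are hypothetical.

```python
import math
import random

def exogenous_vector(length, rho, shape, rng):
    """One realization of texture * Gaussian process (exogenous product model).

    rho:   one-lag correlation of the Gaussian AR(1) component.
    shape: shape parameter of the assumed gamma texture (unit mean).
    """
    tau = rng.gammavariate(shape, 1.0 / shape)  # unit-mean texture (assumed gamma)
    gauss = [rng.gauss(0.0, 1.0)]
    for _ in range(length - 1):                 # unit-variance AR(1) Gaussian process
        innovation = rng.gauss(0.0, 1.0)
        gauss.append(rho * gauss[-1] + math.sqrt(1.0 - rho * rho) * innovation)
    scale = math.sqrt(tau)
    return [scale * gk for gk in gauss]

def kurtosis(values):
    """Sample kurtosis (equal to 3 for Gaussian data)."""
    m = sum(values) / len(values)
    m2 = sum((v - m) ** 2 for v in values) / len(values)
    m4 = sum((v - m) ** 4 for v in values) / len(values)
    return m4 / (m2 * m2)
```

In this sketch the tail behavior is set by the texture law alone while the correlation is set by the Gaussian component alone, which is the independent control described above.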

9.2  Optimal  Decision  Processing  by  the  Transformation  Method 

The  derivation  of  the  transformation  method  discussed  in  Section  4.0  will  be  extended  to  the 
multivariate  case  in  the  future.  Specifically,  the  more  general  case  of  data  vectors  of  length  N 
shall  be  considered.  This  will  allow  the  application  of  the  approach  to  the  more  general  problem 
when multi-sample data is available. The choice of the g(·) function is the critical issue here. In
theory, it is always possible to choose an appropriate function to implement the optimal
mapping. In practice, however, in a high-dimensional space it may not be obvious
how to choose it. We plan to investigate this approach as well as a possible extension to the case
of  random  transformations.  The  usual  SR  method  may  be  thought  of  as  a  random  transformation 
that  is  effected  by  adding  a  random  variable  to  the  data.  However,  the  restriction  to  an  addition 
means  that  the  transformation  is  constrained  to  be  linear.  Clearly,  this  is  so  restrictive  as  to 
impede  the  possible  implementation  of  an  optimal  decision  rule.  Our  approach  will  alleviate  this 
restriction. 
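
As a point of reference for the usual additive SR method, the following sketch evaluates a fixed threshold test when independent Gaussian noise is added to the data. The scenario is an assumption made for illustration (a noiseless constant of amplitude 0.5 under the signal-present hypothesis, zero otherwise, and a fixed threshold of 1.0); the function names are hypothetical. Without added noise the signal never crosses the threshold, while a moderate noise level maximizes the difference between detection and false alarm probabilities.

```python
import math


def q(x):
    """Gaussian tail probability Q(x) = P(N(0, 1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def detection_gain(sigma, amplitude=0.5, threshold=1.0):
    """P_D - P_FA of the fixed test 'y > threshold' after adding N(0, sigma^2) noise."""
    p_d = q((threshold - amplitude) / sigma)   # signal present: y = amplitude + n
    p_fa = q(threshold / sigma)                # signal absent:  y = n
    return p_d - p_fa


# Scan the added-noise standard deviation on a coarse grid.
sigmas = [0.05 * k for k in range(1, 101)]
best_sigma = max(sigmas, key=detection_gain)

if __name__ == "__main__":
    print(best_sigma, detection_gain(best_sigma))
```

The gain vanishes for very small and very large noise levels and peaks in between, which is the characteristic SR signature. The only design freedom in this scheme is the PDF of the added variable, which is the restriction that a more general transformation is meant to relax.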

9.3  Stochastic  Resonance  in  Imagery 

a.  Image  Quality  Metrics 

As  noted  previously,  important  image  processing  considerations  involve  image  noise 
consisting  of  (a)  random  image  noise  such  as  impulsive  and  additive  noise  as  well  as  (b) 
structural  noise  consisting  of  artifacts  in  the  imagery.  Another  important  consideration  is  the 
degradation of image sharpness due to blurring, ringing, and blocking effects. The advantage of
image quality metrics is that they can be used to dynamically monitor and adjust image quality,
optimize parameters and algorithms, and benchmark image processing systems and algorithms.

Future  work  should  further  consider  the  image  metrics  noted  in  Section  6.1.  We  shall  utilize 
the  appropriate  metrics  to  assess  improvement  of  the  filtered  image  quality  and  determine  the 
best  parameters  for  the  SR  based  approaches  as  in  cases  such  as  the  median  filtered  image 
discussed  in  Section  7.0.  By  doing  so,  automatic  filtering  algorithms  will  be  developed  and  it  is 
anticipated  that  performance  will  be  improved  over  current  state-of-the-art  methods. 

b.  Noise  Reduction 

In  our  prior  work  in  Phase  I,  it  has  been  shown  that  SR  can  help  to  improve  the  noise 
filtering  performance  in  image  processing.  In  particular,  in  [Chen,  et.  al.,  2006d],  we  have  shown 
that  SR  noise  can  improve  the  performance  of  median  filtering.  This  provides  us  the  motivation 
to  apply  SR  to  a  much  broader  class  of  image  filtering  and  restoration  problems  for  improved 
human  visualization.  In  unreported  work,  we  have  further  demonstrated  the  existence  of  the  SR 
effect  for  an  image  corrupted  by  Gaussian  noise.  Our  preliminary  experiments  show  that  the 
performance  of  Wiener  filters  improved  when  suitable  SR  noise  is  added.  This  is  very 
encouraging  since  Wiener  filters  are  optimal  for  the  Gaussian  noise  case  with  a  stationary  signal. 
The reason for such improvement is that the assumption of stationarity does not usually
hold in images, even for a small local region; almost all images are non-stationary, i.e., their
content varies depending on location. In the future, we should focus on the derivation of the
optimum  solution  of  the  SR  noise  PDF  for  a  large  class  of  image  enhancement  problems. 
Emphasis  should  be  placed  on  the  development  of  efficient  algorithms,  possibly  including  an 
adaptive  self-learning  algorithm,  to  find  the  optimum  SR  noise  for  image  filtering  applications. 

c. Image Fusion

Image  fusion,  an  important  image  processing  technique  to  integrate  useful  information  from 
different  input  images,  has  been  widely  used  in  a  number  of  applications  such  as  remote  sensing 
and medical image processing. Among all of the image fusion algorithms, wavelet based
multi-resolution image fusion is the most popular and powerful approach. One critical and difficult
aspect  in  such  an  approach  is  to  determine  the  fused  image  wavelet  coefficient  values.  A  number 
of  algorithms  have  been  developed  to  tackle  this  problem.  Future  work  should  focus  on  how  to 
use  SR  to  improve  image  fusion  quality,  namely,  how  the  SR  noise  can  play  a  role  to  better 
select  the  wavelet  coefficients  as  well  as  to  determine  the  suitable  neighborhood  for  an  improved 
fused  image.  The  image  quality  will  be  evaluated  based  on  image  quality  metrics  (see  Section 
6.1) that we have developed based on the human visualization system. Thus, an SR based image
fusion  framework  will  be  developed  and  its  performance  evaluation  will  be  carried  out. 

d.  Target  Detection  and  Identification  in  Images 

An  important  future  consideration  for  SR  enhanced  detection  is  the  extension  of  the  current 
techniques  from  single  pixel  detection  (the  circular  image  of  Fig.  6a)  to  more  general  target 
detection  based  on  image  data.  One  immediate  application  is  to  use  SR  to  enhance  template 
matching for target detection, i.e., given a grey/color image I and a target template X (for
example, a face, a building, or a military target; the template itself may be a simple binary image,
a grey scale image, or a color image), the goal will be to search for the template in the entire
image.  To  find  the  existence  and  position  of  X  in  the  given  noisy  image  Y,  a  template  matching 
technique  is  often  used;  i.e.,  moving  X  on  Y  and  attempting  to  find  the  peak  of  a  similarity 
metric  such  as  the  correlation  function  or  mutual  information.  This  is  similar  to  matched 
filtering in a single dimensional problem. Because Y is often a grey level image, one would
expect some error and a considerable computational burden when searching for the template X in
Y directly, particularly when Y is noisy and/or the matching algorithm is sensitive to
small variations in the grey levels. The problem is similar to the image registration
problem, in which Syracuse University has considerable experience. Therefore, in this problem,
quantization (or segmentation) of Y may become a very important preprocessing step.
Furthermore,  we  would  expect  the  SR  effect  (both  in  quantization  and  after  quantization)  to 
occur  in  a  similar  manner  as  in  SR  enhanced  median  filtering.  The  potential  for  improvement 
may  be  significant.  A  variety  of  applications  can  be  pursued  within  this  framework.  These  might 
include  face  recognition,  image  registration  (to  improve  the  performance  via  registration  after 
segmentation)  and  may  even  be  applicable  to  the  single  dimensional  problem. 
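
The template matching step described above can be sketched as follows. This is a minimal illustration that assumes normalized cross-correlation as the similarity metric and plain nested lists as the image representation; the names ncc and match_template are hypothetical, and no SR noise or quantization stage is included.

```python
import math


def ncc(patch, template):
    """Normalized cross-correlation between two equal-size grey-level blocks."""
    p = [v for row in patch for v in row]
    t = [v for row in template for v in row]
    mean_p, mean_t = sum(p) / len(p), sum(t) / len(t)
    num = sum((a - mean_p) * (b - mean_t) for a, b in zip(p, t))
    den = math.sqrt(sum((a - mean_p) ** 2 for a in p) *
                    sum((b - mean_t) ** 2 for b in t))
    return num / den if den > 0 else 0.0   # flat blocks carry no shape information


def match_template(image, template):
    """Slide template X over image Y and return (row, col, score) at the peak."""
    th, tw = len(template), len(template[0])
    best = (0, 0, -2.0)
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            patch = [row[c:c + tw] for row in image[r:r + th]]
            score = ncc(patch, template)
            if score > best[2]:
                best = (r, c, score)
    return best


if __name__ == "__main__":
    template = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 10.0]]
    image = [[0.0] * 12 for _ in range(12)]
    for dr in range(3):
        for dc in range(3):
            image[5 + dr][3 + dc] = template[dr][dc]
    print(match_template(image, template))
```

A quantization or segmentation stage, with or without added SR noise, would be applied to the image before calling match_template; its effect could then be judged by how often the peak falls at the true target position.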

9.4 Stochastic Resonance in Image Visualization

a.  Enhancement  of  RGB  Imagery 

In  our  prior  work  in  Phase  I,  it  has  been  shown  that  the  quality  of  the  thresholded  ‘Lenna’ 
image  can  be  improved  by  adding  a  suitable  two-peak  noise.  In  future  studies,  the  optimal  noise 
for  a  noisy  image  should  be  determined  where  the  reference  image  is  no  longer  available.  To 
better  utilize  the  spectral  as  well  as  the  spatial  information,  some  image  models  such  as 
multiple-dimensional  Markov  Random  Field  models  will  be  developed  to  better  describe  the 
correlations  between  R,  G  and  B  (red,  green,  blue)  bands  to  further  improve  the  image  quality. 
Performance  evaluation  as  well  as  adaptive  learning  algorithms  will  be  pursued. 

b.  Multi-  and/or  Hyperspectral  Imagery 

In  this  area,  the  potential  of  improving  Remote  Sensing  (RS)  detection  performance  as  well 
as  classification  accuracy  via  SR  will  be  explored.  Our  preliminary  results  show  that  suitable 
Gaussian  SR  noise  can  improve  the  accuracy  of  the  Gaussian  Maximum  Likelihood  Classifier 
(GMLC).  The  role  of  SR  in  this  area  will  be  further  examined  utilizing  the  extensive  techniques 
developed at Syracuse University [Varshney and Arora, 2004]. Possible enhancement of other
processing  algorithms  by  adding  SR  noise  will  also  be  evaluated.  The  form  of  suitable  SR  noise 
for  specific  algorithms  will  be  derived. 

c.  Hyperspectral  Image  Visualization 

Principal  Component  Analysis  (PCA)  and  Segmented  PCA  (SPCA)  based  image  visualization 
techniques  [Tsagaris,  2005]  are  widely  used  to  condense  the  information  contained  in 
hyperspectral  images.  However,  due  to  the  non-Gaussian  nature  of  the  hyperspectral  images, 
PCA  and  SPCA  may  not  fully  utilize  the  hyperspectral  imagery  information.  Moreover,  PCA 
based  techniques  may  introduce  some  artifacts.  In  general,  a  universal  optimal  method  to  display 
the  contents  of  the  hyperspectral  images  does  not  exist.  Therefore,  in  future  analyses,  the 
potential  enhancement  by  SR  in  various  visualization  schemes  will  be  pursued  and  a  better 
visualization  result  is  expected. 

d.  Enhancement  of  Region  of  Interest  (ROI) 

Determination of the region of interest (ROI) is an essential part of many applications, such
as locating possible tumors in medical images and possible
targets/incidents in video surveillance. For these types of problems, description of the statistical
properties  of  the  ROIs  to  specify  the  exact  form  of  the  target  distribution  is  generally  impossible, 
although  some  features  may  be  available.  In  future  studies,  SR  based  techniques  should  be 
developed  to  better  extract  the  features  from  the  image  and  determine  the  location  and  size  of  the 
ROIs. This approach will have a very broad application in medical image processing.

9.5 Visual Image Fusion Considerations for Human Perception

The  rapid  development  in  imaging  and  computing  technology  has  fostered  utilization  of 
visual  information  for  situation  assessment  and  decision  making.  Multiple  image  devices  of 
either  the  same  or  different  modalities  are  used  to  capture  images.  These  images  are  then  fused  to 
integrate  each  image’s  information  content  to  render  a  single  image  of  enhanced  quality.  As 
image  fusion  techniques  become  available,  evaluation  of  their  performance  is  of  high  interest. 
However, there is a requirement for specific metrics to evaluate the quality of fused images and
their effect on the human vision system (HVS). This problem has been addressed at Syracuse
University [Chen and Varshney, 2006]. For the current proposal, such algorithms may be
pertinent to the fusion of multiple images each containing an IID realization of added noise.

Comparative  evaluation  of  fused  images  is  a  critical  step  to  evaluate  the  relative 
performance  of  image  fusion  algorithms.  Human  visual  inspection  is  often  used  to  assess  the 
quality  of  fused  images.  Here,  we  discuss  some  variants  of  a  new  image  quality  metric  based  on 
the  human  vision  system  (HVS).  These  measures  evaluate  the  fused  image  quality  by  comparing 
its  visual  differences  with  the  source  image  thus  requiring  no  ground  truth  knowledge.  First,  the 
images  are  divided  into  several  local  regions.  These  regional  images  are  then  transformed  to  the 
frequency  domain.  Second,  the  difference  between  the  transformed  local  regional  images  is 
weighted  with  a  human  contrast  sensitivity  function  (CSF).  The  local  regional  image  quality  is 
obtained  by  computing  the  mean  square  error  (MSE)  of  the  weighted  difference  of  the  images 
obtained  from  the  fused  regional  image  and  source  regional  image.  Finally,  the  quality  of  a 
fused  image  is  the  weighted  summation  of  the  local  regional  image  quality  measures.  Our 
experimental  results  show  that  these  metrics  are  consistent  with  perceptually  obtained  results. 
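
The regional procedure above can be sketched as follows. Several details are assumptions made only for illustration and are not taken from [Chen and Varshney, 2006]: the Mannos-Sakrison form of the CSF, a viewing condition of 32 pixels per degree, 8x8 regions, a single source image, and region weights equal to the local variance of the source. The function names are hypothetical. The returned value is an error, so lower values indicate a fused image that is perceptually closer to the source.

```python
import cmath
import math


def dft2(block):
    """Naive 2-D DFT of a small square block (adequate for 8x8 regions)."""
    n = len(block)
    w = [[cmath.exp(-2j * math.pi * u * x / n) for x in range(n)] for u in range(n)]
    rows = [[sum(block[y][x] * w[u][x] for x in range(n)) for u in range(n)]
            for y in range(n)]
    return [[sum(rows[y][u] * w[v][y] for y in range(n)) for u in range(n)]
            for v in range(n)]


def csf(f_cpd):
    """Mannos-Sakrison contrast sensitivity (assumed form), f in cycles/degree."""
    return 2.6 * (0.0192 + 0.114 * f_cpd) * math.exp(-(0.114 * f_cpd) ** 1.1)


def region_error(fused_blk, src_blk, ppd=32.0):
    """MSE of the CSF-weighted spectral difference between two regions."""
    n = len(src_blk)
    diff = [[fused_blk[y][x] - src_blk[y][x] for x in range(n)] for y in range(n)]
    spec = dft2(diff)   # the DFT is linear, so this is the difference of the spectra
    total = 0.0
    for v in range(n):
        for u in range(n):
            f = math.hypot(min(u, n - u), min(v, n - v)) / n * ppd
            total += abs(csf(f) * spec[v][u]) ** 2
    return total / (n * n)


def fusion_error(fused, source, block=8):
    """Weighted sum of regional errors; weights are local source variances (assumed)."""
    errs, weights = [], []
    for r in range(0, len(source) - block + 1, block):
        for c in range(0, len(source[0]) - block + 1, block):
            s = [row[c:c + block] for row in source[r:r + block]]
            f = [row[c:c + block] for row in fused[r:r + block]]
            vals = [v for row in s for v in row]
            mean = sum(vals) / len(vals)
            weights.append(sum((v - mean) ** 2 for v in vals) / len(vals))
            errs.append(region_error(f, s))
    wsum = sum(weights)
    if wsum == 0:
        return sum(errs) / len(errs)
    return sum(w * e for w, e in zip(weights, errs)) / wsum
```

With several source images, one option would be to evaluate fusion_error against each source and combine the results; how the published metrics combine them is not specified here.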

The  design  of  a  universal  objective  measure  for  image  quality  is  difficult  due  to  the  large 
variety  of  image  fusion  applications.  Often,  ground  truth  images  are  not  available  so  that  image 
quality  evaluation  is  complicated.  Since  humans  are  the  final  users  of  fused  images,  human 
visual  inspection  is  often  used  as  the  quality  measure.  However,  human  inspection  may  not 
always  be  possible  due  to  large  input  and  output  data  sizes  and  associated  financial  costs.  Several 
automatic  image  evaluation  algorithms  have  been  developed  recently.  A  performance  measure 
for  pixel-level  fusion  performance  that  compares  the  edge  information  of  fused  images  with  edge 
information of input images has been proposed [Xydeas, 2000]. Later, it was used to calculate
the effect of noise on image fusion [Petrov, 2003]. Mutual information was also employed to
evaluate fusion performance [Qu, 2002]. Wang and Shen [Wang, 2003] proposed a quantitative
correlation analysis method to evaluate hyperspectral image fusion algorithms. [Piella, 2004]
has proposed some new quality measures based on an image quality index proposed by Wang
and Bovik [Wang, 2002]. However, there is no established direct relationship between these
evaluation measures and the real perceptual results of humans.

A  number  of  linear  and  nonlinear  models  have  been  proposed  to  simulate  the  response  of  the 
HVS.  Use  of  a  contrast  sensitivity  function  (CSF)  to  modify  the  difference  of  two  images  in  the 
frequency  domain  is  popular.  However,  as  the  CSF  is  only  applied  in  the  frequency  domain,  it 
does  not  include  the  nonlinear  local  spatial  information  such  as  local  luminance  and  contrast. 
Yet local spatial information is very important in image fusion because it strongly relates
to the image content that is transferred from input images to the fused image. Here, we propose
a scheme to compare the quality of different fused images by comparing them with input
images  based  on  both  types  of  local  information.  These  are  given  by  the  salience  of  a  set  of 
localized  windows  and  the  difference  in  the  frequency  domain  filtered  by  CSF.  Compared  to 
other  measures,  our  proposed  metrics  have  several  advantages.  First,  we  introduce  this 
methodology  to  evaluate  the  image  fusion  performance.  Second,  in  the  conventional  CSF  based 
methods,  the  entire  image  is  considered.  We  employ  CSF  based  methods  on  a  region-by-region 
basis.  This  is  more  suitable  for  the  fusion  application  because  one  should  examine  image  quality 
at  a  local  level  rather  than  at  a  global  level.  Moreover,  image  content  and  statistics  vary  over  an 
image  and,  therefore,  our  region-based  image  quality  measure  is  more  appropriate.  Finally, 
compared  to  other  methods  such  as  mutual  information  based  methods,  our  proposed 
methods  require  much  less  computation. 

We  illustrate  the  utility  of  several  image  quality  metrics  under  investigation  at  Syracuse 
University  (SU)  by  applying  them  to  evaluate  image  quality  using  three  fusion  schemes.  The 
first is the ‘wavelet’ fusion algorithm where the input images are decomposed using a length
four  Daubechies  wavelet  filter.  The  fused  image  coefficients  are  computed  by  choosing  the 
corresponding  input  image  coefficients  with  largest  wavelet  domain  amplitude  and  by  averaging 
the  coefficients  of  lowest  resolution.  The  number  of  decomposition  levels  is  two.  This  fusion 
algorithm emphasizes the fused image edge information. The second algorithm is the
‘averaging’ method consisting of the average of the input images on a pixel-by-pixel basis. The
third is the ‘Laplacian’ pyramid based method where the input image is decomposed using a
Laplacian  pyramid  decomposition  and  the  fused  image  is  reconstructed  by  averaging  the  low 
resolution  components  and  selecting  the  coefficients  with  largest  amplitude  for  the  remaining 
coefficients. 
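
The ‘wavelet’ fusion rule can be sketched as follows. This is a minimal illustration rather than the implementation used for the reported results: it assumes square images whose side is divisible by four, periodic boundary extension, and the orthonormal length-four Daubechies filter; all function names are hypothetical.

```python
import math

_S3, _NORM = math.sqrt(3.0), 4.0 * math.sqrt(2.0)
H = [(1 + _S3) / _NORM, (3 + _S3) / _NORM, (3 - _S3) / _NORM, (1 - _S3) / _NORM]
G = [H[3], -H[2], H[1], -H[0]]          # matching high-pass (detail) filter


def analyze(x):
    """One periodic analysis step: returns (approximation, detail) halves."""
    n = len(x)
    a = [sum(H[m] * x[(2 * k + m) % n] for m in range(4)) for k in range(n // 2)]
    d = [sum(G[m] * x[(2 * k + m) % n] for m in range(4)) for k in range(n // 2)]
    return a, d


def synthesize(a, d):
    """Inverse of analyze (the transform is orthogonal)."""
    n = 2 * len(a)
    x = [0.0] * n
    for k in range(len(a)):
        for m in range(4):
            x[(2 * k + m) % n] += H[m] * a[k] + G[m] * d[k]
    return x


def dwt2(img, levels):
    """Separable 2-D transform; approximation ends up in the top-left corner."""
    c = [row[:] for row in img]
    size = len(c)
    for _ in range(levels):
        for r in range(size):
            a, d = analyze(c[r][:size])
            c[r][:size] = a + d
        for j in range(size):
            a, d = analyze([c[r][j] for r in range(size)])
            for r, v in enumerate(a + d):
                c[r][j] = v
        size //= 2
    return c


def idwt2(coef, levels):
    """Inverse of dwt2, undoing the coarsest level first."""
    c = [row[:] for row in coef]
    size = len(c) >> (levels - 1)
    for _ in range(levels):
        half = size // 2
        for j in range(size):
            col = [c[r][j] for r in range(size)]
            for r, v in enumerate(synthesize(col[:half], col[half:])):
                c[r][j] = v
        for r in range(size):
            c[r][:size] = synthesize(c[r][:half], c[r][half:size])
        size *= 2
    return c


def fuse(img_a, img_b, levels=2):
    """Average the lowest-resolution band; keep the larger-magnitude detail."""
    ca, cb = dwt2(img_a, levels), dwt2(img_b, levels)
    n, low = len(ca), len(ca) >> levels
    fused = [[(ca[r][j] + cb[r][j]) / 2.0 if r < low and j < low
              else (ca[r][j] if abs(ca[r][j]) >= abs(cb[r][j]) else cb[r][j])
              for j in range(n)] for r in range(n)]
    return idwt2(fused, levels)
```

The ‘averaging’ method corresponds to replacing the detail selection with the same average used for the lowest-resolution band, which reduces to a pixel-by-pixel mean of the inputs.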

The  images  shown  in  Fig.  8.1  are  created  by  blurring  the  original  speckle  noisy  ‘Lenna’ 
image  (a)  of  size  256x256  using  a  10x10  mean  filter  twice  to  obtain  two  input  images  (b  and  c) 
to  be  fused.  Input  images  (b)  and  (c)  have  blurring  that  occurs  in  the  right  and  left  half  of  the 
images, respectively. In Fig. 8.1d, e, and f, we observe that the ‘wavelet’ method outperforms
the other two. Performance results for 10 repeated trials are shown in Table 1. These include
new metrics developed at Syracuse University denoted by the first three superscripted Q’s. For
low SNR, the ‘averaging’ method is best, while at high SNR, the ‘wavelet’ method is superior.
The results were confirmed by seven human subjects. They preferred the ‘wavelet’ method for
SNR > 25 dB and the ‘averaging’ method for SNR < 25 dB. Details of the new performance
metrics are considered further in [Chen & Varshney 2006].

In  Phase  II,  we  will  continue  our  consideration  of  new  perceptually  based  quality  metrics  for 
image  fusion  which  do  not  require  a  reference  image.  Compared  to  other  measures,  these 
metrics  are  easier  to  calculate  and  are  also  applicable  to  different  input  modalities.  Experimental 
results  show  that  our  quality  metrics  fit  the  results  of  human  visual  inspection  well  and  also 
correlate  well  with  other  measures.  Further,  our  proposed  metrics  can  be  used  to  determine  the 
fusion  performance  of  different  algorithms  with  different  noise  PDFs.  Several  extensions  to  this 
work  such  as  extending  our  input  images  to  include  multi-color  images,  hyperspectral  imaging, 
and  stochastic  resonance  effects  were  initiated  in  the  Phase  I  effort  and  will  be  continued  in 
Phase II.

Fig. 9.1 Fused images for the speckle noisy ‘Lenna’ image set with different SNRs. The left
column uses the ‘wavelet’ method, the center column uses the ‘averaging’ method, and the right
column uses the ‘Laplacian’ method. (a), (b), (c): fused results when SNR = 15 dB; (d), (e), (f):
SNR = 20 dB; (g), (h), (i): SNR = 25 dB.

Table 9.1 Performance results for visual image fusion and human visualization.

9.6 Stochastic Resonance in Distributed Systems

It  has  been  shown  in  our  preliminary  work  during  Phase  I  that  SR  plays  a  role  in  decision 
fusion  systems.  In  [Chen  et  al.,  2006e],  we  have  shown  the  existence  of  the  SR  effect  in 
distributed  detection  systems  for  the  two  hypotheses  detection  problem.  In  future  research,  we 
will  formulate  and  explore  the  general  framework  of  SR  in  distributed  detection  and  estimation 
problems.  The  role  of  SR  in  such  a  collaborative  context  forms  a  fascinating  research  area,  one 
that  has  never  been  considered. 

a.  SR  in  Distributed  Estimation 

In  distributed  estimation  systems,  the  data  is  compressed  (often  quantized)  at  the  sensor 
before  it  is  transmitted  to  the  fusion  center.  As  only  limited  information  is  transferred  from  the 
local  sensors  to  the  processing  sensor,  some  nonlinear  transformations  are  often  applied  to  the 
received  data.  The  role  of  SR  noise  in  such  applications  will  be  evaluated  and  the  possible 
estimation  accuracy  improvement  by  adding  SR  noise  will  be  examined. 
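
A simple way to see why added noise can help a quantized estimation system is the one-bit case sketched below. The setting is an assumption chosen for illustration (each sensor observes the same constant theta without measurement noise and sends only its sign; the added noise is uniform); the function names are hypothetical and this is not the scheme analyzed in the Phase I work.

```python
import random


def one_bit_reports(theta, n_sensors, dither, rng):
    """Each sensor sends sign(theta + n), with n ~ Uniform(-dither, dither)."""
    bits = []
    for _ in range(n_sensors):
        noise = rng.uniform(-dither, dither) if dither > 0 else 0.0
        bits.append(1.0 if theta + noise >= 0 else -1.0)
    return bits


def fuse_estimate(bits, dither):
    """Fusion-center estimate, using E[sign] = theta / dither for |theta| <= dither."""
    return dither * sum(bits) / len(bits)


if __name__ == "__main__":
    rng = random.Random(1)
    theta = 0.3
    print(fuse_estimate(one_bit_reports(theta, 10000, 1.0, rng), 1.0))
    print(set(one_bit_reports(theta, 100, 0.0, rng)))   # identical bits, no magnitude information
```

Without added noise every sensor reports the same bit, so the fusion center learns only the sign of theta. With uniform noise of sufficient width, the fraction of positive reports varies linearly with theta and the magnitude becomes recoverable.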

b.  SR  in  Distributed  Detection 

In  distributed  detection  problems,  Likelihood  Ratio  Tests  (LRT)  or  locally  optimal  detectors 
(LOD) (when signal strength is weak) are often used at the local sensors. These types of
detectors are optimum when the sensor observations are conditionally independent [Varshney, 1996].
However, in many practical applications, the sensor observations are at least partially dependent. In that case,
the LRT or LOD are no longer optimum detection approaches. Although the optimal solution for
this  problem  may  be  solvable  when  the  full  information  about  all  the  sensors  is  available,  the 
design  of  each  corresponding  local  detector  without  this  knowledge  (which  is  the  desired  goal  of 
distributed  systems)  is  often  either  too  complicated  or  very  difficult  to  implement.  In  this 
proposed study, we will explore the possible enhancement by adding SR noise to the local
sensors, utilizing the (partial or full) information available about the local sensors to
improve the local detection performance without altering the local detectors. Compared to the
complicated  approaches  that  are  currently  being  investigated  in  the  distributed  detection 
literature,  our  SR  based  approach  may  provide  a  very  simple  and  effective  approach  for 
improved  detection  performance.  This  will  have  a  major  impact  on  application  areas  such  as 
vehicle  health  management  (VHM)  efforts  for  avionic  systems  currently  being  studied  at 
Syracuse University in collaboration with NASA Langley Research Center.